Re: Infinite loop in compiling javax/swing/text/html/parser/HTML_401F.list

2007-10-28 Thread David Daney
Dominique Dhumieres wrote:
> On powerpc-apple-darwin8, I killed jc1 after it took over 37:29.81 at:
>
> ...
> echo
../../../../gcc-4.3-work/libjava/classpath/lib/gnu/javax/swing/text/html/parser/HTML_401F*.class>
gnu/javax/swing/text/html/parser/HTML_401F.list
> /bin/sh ./libtool --tag=GCJ --mode=compile
/opt/gcc/darwin_buildw/gcc/gcj
-B/opt/gcc/darwin_buildw/powerpc-apple-darwin8/ppc64/libjava/
-B/opt/gcc/darwin_buildw/gcc/ -fclasspath=
-fbootclasspath=../../../../gcc-4.3-work/libjava/classpath/lib
--encoding=UTF-8 -Wno-deprecated -fbootstrap-classes -g -O2  -m64 -c -o
gnu/javax/swing/text/html/parser/HTML_401F.lo
-fsource-filename=/opt/gcc/darwin_buildw/powerpc-apple-darwin8/ppc64/libjava/classpath/lib/classes
-MT gnu/javax/swing/text/html/parser/HTML_401F.lo -MD -MP -MF
gnu/javax/swing/text/html/parser/HTML_401F.deps
@gnu/javax/swing/text/html/parser/HTML_401F.list
> libtool: compile:  /opt/gcc/darwin_buildw/gcc/gcj
-B/opt/gcc/darwin_buildw/powerpc-apple-darwin8/ppc64/libjava/
-B/opt/gcc/darwin_buildw/gcc/ -fclasspath=
-fbootclasspath=../../../../gcc-4.3-work/libjava/classpath/lib
--encoding=UTF-8 -Wno-deprecated -fbootstrap-classes -g -O2 -m64 -c
-fsource-filename=/opt/gcc/darwin_buildw/powerpc-apple-darwin8/ppc64/libjava/classpath/lib/classes
-MT gnu/javax/swing/text/html/parser/HTML_401F.lo -MD -MP -MF
gnu/javax/swing/text/html/parser/HTML_401F.deps
@gnu/javax/swing/text/html/parser/HTML_401F.list  -fno-common -o
gnu/javax/swing/text/html/parser/.libs/HTML_401F.o
>
> Was I pessimistic or is there a bug?
>

You need at least 256MB of memory to compile HTML_401F.  A lot of time
is also useful.  If the system is not thrashing, I would give it a
couple of hours before calling it broken.

David Daney

> TIA
>
> Dominique



Re: Infinite loop in compiling javax/swing/text/html/parser/HTML_401F.list

2007-10-28 Thread Dominique Dhumieres
David,

> ... I would give it a couple of hours before calling it broken.

You are right, a small "couple" of hours is needed for the three stages: slightly
less than two hours on my machine (1.8 GHz G5). I never noticed that this part
took so long and I was too eager to do something else with the CPU. Sorry for the
noise.

Thanks for the answer.

Dominique


Re: -fno-tree-cselim not working?

2007-10-28 Thread Andi Kleen
On Fri, Oct 26, 2007 at 01:23:15PM -0700, Ian Lance Taylor wrote:
> Andi Kleen <[EMAIL PROTECTED]> writes:
> 
> > Ian Lance Taylor <[EMAIL PROTECTED]> writes:
> > >
> > > This code isn't going to be a problem, because spin_unlock presumably
> > > includes a memory barrier.
> > 
> > At least in the Linux kernel and also in glibc for mutexes locks are just 
> > plain
> > function calls, which are not necessarily full memory barriers.
> 
> True, and problematic in some cases--but a function call which gcc
> can't see is a memory barrier for all addressable memory.

I constructed a test case now to show why the optimization is a bad 
idea in general. It just essentially measures how much it costs
to do the access on a cache cold variable. On a Core2 this is about 

% gcc -o tstore tstore.c 
% ./tstore 
209 cycles
% gcc -O2 -o tstore tstore.c 
% ./tstore 
671 cycles

It runs about 3x faster without optimization (no if conversion of
variable++) than with it, because of the cache miss.

Your patch would fix it too because it uses a function call, but
it might not in the general case when the condition happens to be 
not a function call.

-Andi

(x86 specific, but can be adapted to other architectures)


#include <stdio.h>
#include <string.h>

int GO_SLOW = 0;

#define LARGE (5*1020*1024)
int larger_than_cache[LARGE];
int variable;

int go_slow(void);

/* Read the x86 time stamp counter. */
static inline unsigned long long rdtsc(void)
{
    unsigned a, d;
    asm volatile("rdtsc" : "=a" (a), "=d" (d));
    return a | ((unsigned long long)d) << 32;
}

void test(void)
{
    unsigned long long start, end;
    start = rdtsc();
    if (go_slow())
        variable++;
    end = rdtsc();
    printf("%llu cycles\n", end - start);
}

int go_slow(void)
{
    return GO_SLOW;
}

int main(void)
{
    variable++;
    /* Push variable out of the cache by touching a larger array. */
    memset(&larger_than_cache, 0, sizeof larger_than_cache);
    go_slow();
    test();
    return 0;
}
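
To make the "if conversion of variable++" above concrete, here is a minimal
sketch of the two shapes the conditional increment in test() can take. The
function names are hypothetical, and the second function is a hand-written
approximation of what an if-converting compiler might produce, not actual
GCC output.

```c
int variable;

/* Source form: variable is only touched when cond is nonzero. */
void increment_if(int cond)
{
    if (cond)
        variable++;
}

/* Approximation of the if-converted form (an assumption, not compiler
   output): the branch is replaced by a select, so the load and the store
   of variable happen on every call. */
void increment_if_converted(int cond)
{
    int old = variable;                /* unconditional load */
    int next = cond ? old + 1 : old;   /* select, a candidate for cmov */
    variable = next;                   /* unconditional store */
}
```

In a single thread both functions leave variable with the same value. The
difference is that the second one always reads and writes variable, so it
pays for the cold cache line even when cond is zero.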


Re: Infinite loop in compiling javax/swing/text/html/parser/HTML_401F.list

2007-10-28 Thread Gerald Pfeifer
On Sun, 28 Oct 2007, David Daney wrote:
> You need at least 256MB of memory to compile HTML_401F.  A lot of time
> is also useful.  If the system is not thrashing, I would give it a
> couple of hours before calling it broken.

Have we reduced memory consumption recently?  On i386-freebsd the number
I identified earlier this year was 700MB, 512MB definitely _not_ being 
sufficient.  I'd be very interested in your measurements, perhaps we can
reduce the limit somewhat!

Gerald


Re: Optimization of conditional access to globals: thread-unsafe?

2007-10-28 Thread Bart Van Assche
On 10/27/07, Florian Weimer <[EMAIL PROTECTED]> wrote:
>
> Anyway, not reordering across function calls is not sufficient to get
> sane threading semantics (IIRC, this is also explained in detail in Hans
> Boehm's paper).

Hello Florian,

In Hans Boehm's paper the following issues are identified:
1. Concurrent accesses of variables without explicit locking can cause
unexpected results in a multithreaded context (paragraph 4.1).
2. If non-atomic variables (e.g. one field of a bitfield) are shared
over threads, and these are not protected by explicit locking,
updating such a variable in a multithreaded context is troublesome
(paragraph 4.2).
3. If the compiler performs register promotion on a shared variable,
this can cause undesired results in a multithreaded context (paragraph
4.3)

And this thread started with:
4. If the compiler generates a store operation for an assignment
statement that is not executed, this can cause trouble in a
multithreaded context.

My opinion is that, given the importance of multithreading, it should
be documented in the gcc manual which optimizations can cause trouble
in multithreaded software (such as (3) and (4)). It should also be
documented which compiler flags must be used to disable optimizations
that cause trouble for multithreaded software. Requiring that all
thread-shared variables should be declared volatile is completely
unacceptable. We need a solution today for the combination of C/C++
and POSIX threads, we can't wait for the respective language
standardization committees to come up with a solution.

Regarding issues (1) and (2): (1) can be addressed by using
platform-specific or compiler-specific solutions, e.g. the datatype
atomic_t provided by the Linux kernel headers. And any prudent
programmer won't write code that triggers (2).

And as you may have noted, I do not agree with Hans Boehm where he
states that the combination of C/C++ with POSIX threads cannot result
in correctly working programs. I believe that the issues raised by
Hans Boehm can be solved.

Bart Van Assche.


Re: Optimization of conditional access to globals: thread-unsafe?

2007-10-28 Thread Robert Dewar

Bart Van Assche wrote:

> My opinion is that, given the importance of multithreading, it should
> be documented in the gcc manual which optimizations can cause trouble
> in multithreaded software (such as (3) and (4)). It should also be
> documented which compiler flags must be used to disable optimizations
> that cause trouble for multithreaded software. Requiring that all
> thread-shared variables should be declared volatile is completely
> unacceptable.

Why is this unacceptable .. seems much better to me than writing
undefined stuff.

> And as you may have noted, I do not agree with Hans Boehm where he
> states that the combination of C/C++ with POSIX threads cannot result
> in correctly working programs. I believe that the issues raised by
> Hans Boehm can be solved.

Well Hans is talking about C/C++, you are talking about some other
language in which programs which do not have well defined semantics
in C or C++ do have well defined semantics in your language.

> Bart Van Assche.





Re: Optimization of conditional access to globals: thread-unsafe?

2007-10-28 Thread Andrew Haley
Bart Van Assche writes:

 > We need a solution today for the combination of C/C++ and POSIX
 > threads, we can't wait for the respective language standardization
 > committees to come up with a solution.

And, in the proposed memory model, I believe we have one.  If there is
some reason you believe the proposed memory model won't be sufficient,
then maybe we can start looking at doing something gcc local.  But it
would surely be much better to do this through the ISO TC.  They will
doubtless be glad to respond to any concerns that you have.

Andrew.


Re: Optimization of conditional access to globals: thread-unsafe?

2007-10-28 Thread Tomash Brechko
On Sun, Oct 28, 2007 at 09:47:36 -0400, Robert Dewar wrote:
> Bart Van Assche wrote:
> 
> >Requiring that all thread-shared variables should be declared
> >volatile is completely unacceptable.
> 
> Why is this unacceptable .. seems much better to me than writing
> undefined stuff.

There's a parallel thread in the Linux Kernel Mailing List.  Everyone
is advised to read it, if not already.  There are several good points
there:

  - the problem is not limited to multithreaded domain: the page with
the object could be made read-only during execution, thus

    if (! page_is_read_only)
      v = 1;

would SIGSEGV for no apparent reason.

  - making things volatile is unacceptable from performance POV.

  - optimization in question might well turn out to be misoptimization
for anything but microbenchmarks (read LKML for cache flush/dirty
page issues).

  - "people knowledgeable in POSIX say that this optimization is
bogus".  I would add that though we may say that Standard C is not
aware of threads, POSIX _is_ aware of Standard C.  While POSIX
failed to solve the issue by formal word, its intent is clear: to
make POSIX Threads usable.  The compiler that claims to be POSIX
compatible should take this into account.

  - there's also a good talk on lawyer-ish vs attached-to-reality
approach.  I personally doubt those who continue to advise to use
volatile are actually writing such multithreaded programs.  Most
argue just for the fun of it.
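
The read-only page point in the list above can be tried out with a small
sketch. Everything here is an assumption made for illustration: the function
names are hypothetical, speculative_store() is a hand-written stand-in for
the transformed code rather than compiler output, and a POSIX system with
mmap() and mprotect() is assumed.

```c
#define _DEFAULT_SOURCE   /* assumption: glibc; exposes MAP_ANONYMOUS */
#include <setjmp.h>
#include <signal.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

static sigjmp_buf fault_env;

static void on_fault(int sig)
{
    (void)sig;
    siglongjmp(fault_env, 1);   /* unwind out of the faulting store */
}

/* Source form: nothing is written when cond is zero. */
void guarded_store(volatile int *v, int cond)
{
    if (cond)
        *v = 1;
}

/* Stand-in for the transformed code: the store always executes.
   volatile keeps the compiler from removing it again. */
void speculative_store(volatile int *v, int cond)
{
    int tmp = *v;
    if (cond)
        tmp = 1;
    *v = tmp;
}

/* Calls fn(page, 0) on a read-only page.
   Returns 1 if the call faulted, 0 if it completed, -1 on setup failure. */
int faults_on_readonly_page(void (*fn)(volatile int *, int))
{
    long page_size = sysconf(_SC_PAGESIZE);
    struct sigaction sa = {0}, old_segv, old_bus;
    volatile int faulted = 0;
    int *page;

    if (page_size <= 0)
        return -1;
    page = mmap(NULL, (size_t)page_size, PROT_READ | PROT_WRITE,
                MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (page == MAP_FAILED)
        return -1;
    if (mprotect(page, (size_t)page_size, PROT_READ) != 0) {
        munmap(page, (size_t)page_size);
        return -1;
    }

    sa.sa_handler = on_fault;
    sigemptyset(&sa.sa_mask);
    sigaction(SIGSEGV, &sa, &old_segv);
    sigaction(SIGBUS, &sa, &old_bus);   /* some systems raise SIGBUS instead */

    if (sigsetjmp(fault_env, 1) == 0)
        fn(page, 0);                    /* cond == 0: the page is read-only */
    else
        faulted = 1;

    sigaction(SIGSEGV, &old_segv, NULL);
    sigaction(SIGBUS, &old_bus, NULL);
    munmap(page, (size_t)page_size);
    return faulted;
}
```

Calling faults_on_readonly_page(guarded_store) should complete without a
fault, while faults_on_readonly_page(speculative_store) should report one,
even though both functions compute the same result on writable memory.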


> Well Hans is talking about C/C++, you are talking about some other
> language in which programs which do not have well defined semantics
> in C or C++ do have well defined semantics in your language.

Good thing we have this _bug_ in languages that define memory
semantics (Ada, Java), and no one yet argues that GCC should be fixed
wrt to only those languages.


-- 
   Tomash Brechko


RE: Optimization of conditional access to globals: thread-unsafe?

2007-10-28 Thread Dave Korn
On 28 October 2007 13:32, Bart Van Assche wrote:

>  Requiring that all
> thread-shared variables should be declared volatile is completely
> unacceptable. 

  Any variable that may be altered by an external unpredictable asynchronous
'force majeure' must be declared volatile or the behaviour is undefined.  Your
code is simply incorrect, and you appear to be demanding that the language
standards and the compiler all be revised to make the buggy code valid.

> We need a solution today for the combination of C/C++
> and POSIX threads, we can't wait for the respective language
> standardization committees to come up with a solution.

  You already have it, but you have declared it "unacceptable" and refused to
use it without stating any clear reason.

cheers,
  DaveK
-- 
Can't think of a witty .sigline today



Re: Infinite loop in compiling javax/swing/text/html/parser/HTML_401F.list

2007-10-28 Thread David Daney
Gerald Pfeifer wrote:
> On Sun, 28 Oct 2007, David Daney wrote:
>   
>> You need at least 256MB of memory to compile HTML_401F.  A lot of time
>> is also useful.  If the system is not thrashing, I would give it a
>> couple of hours before calling it broken.
>> 
>
> Have we reduced memory consumption recently? 

That would be nice, but I don't think so.

>  On i386-freebsd the number
> I identified earlier this year was 700MB, 512MB definitely _not_ being 
> sufficient.  I'd be very interested in your measurements, perhaps we can
> reduce the limit somewhat!
>   

I regularly bootstrap c,c++,java on a mips-linux-gnu system with 256MB
of RAM (and 2000MB swap).

I also bootstrap c,c++ on mipsel-linux-gnu with a 128MB (also with large
swap) system.

Some files in libjava (interpreter.cc and HTML_401F.java) use a
proverbial ton of memory, but don't thrash so badly that they cannot
finish in a couple of hours with 256MB.

David Daney


Re: GCC 4.3 release schedule

2007-10-28 Thread Jason Merrill

Dennis Clarke wrote:

>    Is "correctness" a feature ?

Yes, but not one that gets merged in during stage 1 :)

>    I would like to see a nice clean GCC 4.2.x before GCC 4.3.zero even gets
> thought of.  Why would one simply branch towards the next release when
> the previous one still needs some work?  To appease sales people and
> developers making noises for features?


Planning for 4.3.0 has no effect at all on the 4.2.x schedule.  People 
work on 4.2.x or not depending on their own priorities.  I've fixed a 
few bugs in 4.2 since the 4.2.2 release, and I fully expect there will 
be at least a 4.2.3 before 4.3.0 goes out.


Distributors are interested in a timely 4.3.0 because they'll be using 
whatever compiler they settle on for a long time and would like it to be 
up to date.  Sometimes some of the new features are important to support 
the needs of other parts of the system.


Jason


Re: Infinite loop in compiling javax/swing/text/html/parser/HTML_401F.list

2007-10-28 Thread Eric Botcazou
> I never noticed that this part was so long and I was too eager to do
> something else with the CPU. Sorry for the noise.

Don't be sorry, I can reproduce a compilation time surge for libjava on my 
machine (AMD64, 2.4 GHz, 1 GB).  In particular, HTML_401F.o now takes 40 min 
to compile for each version of the library.

The surge comes from the fix for PR tree-optimization/33870 (rev 129675).  
With it, 'make all-target-libjava' takes 261 min user time.  If I revert it,
do a 'make quickstrap' and run the command again, it only takes 109 min.

-- 
Eric Botcazou


Re: Infinite loop in compiling javax/swing/text/html/parser/HTML_401F.list

2007-10-28 Thread Richard Guenther
On 10/28/07, Eric Botcazou <[EMAIL PROTECTED]> wrote:
> > I never noticed that this part was so long and I was too eager to do
> > something else with the CPU. Sorry for the noise.
>
> Don't be sorry, I can reproduce a compilation time surge for libjava on my
> machine (AMD64, 2.4 GHz, 1 GB).  In particular, HTML_401F.o now takes 40 min
> to compile for each version of the library.
>
> The surge comes from the fix for PR tree-optimization/33870 (rev 129675).
> With it, 'make all-target-libjava' takes 261 min user time.  If I revert it,
> do a 'make quickstrap' and run the command again, it only takes 109 min.

I'm working on it.

Richard.


Re: Optimization of conditional access to globals: thread-unsafe?

2007-10-28 Thread Michael Matz
Hi,

On Fri, 26 Oct 2007, Tomash Brechko wrote:

> On Fri, Oct 26, 2007 at 21:45:03 +0400, Tomash Brechko wrote:
> > Note that it doesn't cancel cmoves, as those are loads, not stores.
> 
> I just checked with x86 instruction reference, CMOVcc is reg->reg or
> mem->reg, never reg->mem.  You know God's deed when you see it. :)

I wasn't precise about which optimization is actually the important one.  The 
important thing about this loop is that the data is basically random, so 
that branch prediction has no chance to do any good.  Consequently all 
branches in that loop have a pretty high cost.  So high in fact that it's 
better to replace them with conditional moves on the value to store and make 
the store unconditional.

So, yes, there are no conditional store instructions on x86, but the 
branches need to be removed anyway for performance, and for that we need 
to make the stores unconditional (even at the cost of perhaps introducing 
another load).

You are also right that for that example we can determine that an 
unconditional store already dominates (and postdominates) the conditional 
stores in question and hence would already be thread-unsafe, so the 
transformation would be okay even with thread-safeness in mind.

I was merely showing that this transformation _does_ matter in some cases, 
to refute opposite claims which seemed to be expressed too airily in this 
thread.

Now there are multiple ways out of this dilemma, retaining the 
transformation and not breaking threaded code:
1) do the transformation only if there are already other stores in an 
   outer control region.  I see that already being worked on down-thread.
2) do the transformation but also conditionalize the address of the store:
   if (cond)
     *p = val;

   --->

   __typeof__ (*p) dummy;
   if (!cond)
     p = &dummy;  // dummy a stack slot, hence no trap, no thread
                  // implications
   *p = val;
   
I plan to work on the latter at some point anyway, as it also allows me to do 
the transformation if unconditional non-trappingness can't be proven.
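
Option 2 can be written out as an ordinary function pair. This is only a
sketch: the function names are hypothetical, int stands in for
__typeof__ (*p), and whether a compiler actually emits a conditional move
for the address selection is not something the sketch can guarantee.

```c
/* Source form: conditional store through p. */
void store_branchy(int *p, int cond, int val)
{
    if (cond)
        *p = val;
}

/* Transformed form: the address is selected, the store is unconditional.
   When cond is zero the store lands in a private stack slot, so the
   object behind the original p is never written. */
void store_select_address(int *p, int cond, int val)
{
    int dummy;          /* stack slot: cannot trap, not visible to other threads */
    if (!cond)
        p = &dummy;     /* a candidate for a conditional move on the pointer */
    *p = val;           /* unconditional store */
}
```

Both functions leave the caller's object untouched when cond is zero, which
is what distinguishes this variant from making the store to *p itself
unconditional.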


Ciao,
Michael.

P.S.: I'm still somewhat disappointed about the way this discussion goes, 
it reminds me of the ugly one about signed integer overflow.  There it was 
an overly vocal set of people refusing to write ISO C which led to a very 
intrusive change in the compiler.  Now this seems to happen again (though 
no such intrusive changes would be required right now, but perhaps for the 
other memory model).  Then and now the presumed "deficiencies" had already 
existed for years, but for some unfathomable reason only resulted in 
a tempest in a teacup recently.  I don't think it's a good strategy to 
change the compiler in a strictly speaking wrong direction whenever the 
loudness of whiners reaches a certain level.


Re: Optimization of conditional access to globals: thread-unsafe?

2007-10-28 Thread Michael Matz
Hi,

On Fri, 26 Oct 2007, David Miller wrote:

> Also, it bears repeating that whatever performance argument you make for 
> or against this issue matters little if it breaks lots of existing and 
> working code.

It matters insofar as that existing and working code is broken in a strict 
sense.  As long as that holds there's a high chance of "breaking" it at 
random times again in the future, even when the one transformation is 
changed to not "break" it anymore.  So, we can either change the 
transformation and wait for the next uproar in a couple of months or 
somehow hope that the code gets fixed.  But that's the same argument 
as in the signed integer overflow discussion, so my hopes for the latter 
are quite low.  I mean who am I to demand that people write correct code, 
I must be insane.

> It is also important to remind people that paper standards count less 
> than common sense and what effects users on a practical level, even when 
> those paper standards allow your favorite optimization or 
> transformation.


You mean like POSIX doesn't count very much for the kernel behaviour?


You ask us to somehow regard common sense (whatever that is) and 
practicality reasons (for which set of people?) higher than paper 
standards.  How come, then, that under Linux directories are still 
seekable?  Certainly when I sometimes try to convince our kernel people of 
some clever idea, they happily wield the POSIX hammer.  I sigh 
and move on.  So what exactly puts you in a position to define common 
sense or which paper standards we should ignore?


Ciao,
Michael.


Re: Optimization of conditional access to globals: thread-unsafe?

2007-10-28 Thread Robert Dewar

Erik Trulsson wrote:


Unfortunately it seems that the POSIX standard for threads say that as long
as access to a shared variable is protected by a mutex there is no need to
use 'volatile'.


How does it say this, in some semantically precise way, or with hand
waving as in this sentence.


This means that POSIX essentially defines certain behaviours that the C
standard left undefined.


But does it do so precisely?



Re: Optimization of conditional access to globals: thread-unsafe?

2007-10-28 Thread Erik Trulsson
On Sun, Oct 28, 2007 at 03:03:46PM -, Dave Korn wrote:
> On 28 October 2007 13:32, Bart Van Assche wrote:
> 
> >  Requiring that all
> > thread-shared variables should be declared volatile is completely
> > unacceptable. 
> 
>   Any variable that may be altered by an external unpredictable asynchronous
> 'force majeure' must be declared volatile or the behaviour is undefined.  Your
> code is simply incorrect, and you appear to be demanding that the language
> standards and the compiler all be revised to make the buggy code valid.


Unfortunately it seems that the POSIX standard for threads says that as long
as access to a shared variable is protected by a mutex there is no need to
use 'volatile'.

This means that POSIX essentially defines certain behaviours that the C
standard left undefined.

Personally I think the POSIX standard is broken in this regard, but if
programs that are valid according to POSIX are to work correctly then it is
not sufficient for the compiler to follow the C standard.  It must also not
break any of the guarantees that POSIX makes.
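
The coding pattern this claim is about looks roughly like the sketch below:
a shared variable that is not volatile, with every access made while a mutex
is held. The names are hypothetical, and the sketch only shows the pattern;
whether POSIX formally guarantees it is exactly what this thread disputes.
Building it may require -pthread.

```c
#include <pthread.h>
#include <stddef.h>

static pthread_mutex_t counter_lock = PTHREAD_MUTEX_INITIALIZER;
static long counter;            /* shared, deliberately not volatile */

static void *worker(void *arg)
{
    long iterations = *(long *)arg;
    for (long i = 0; i < iterations; i++) {
        pthread_mutex_lock(&counter_lock);
        counter++;              /* every access happens under the mutex */
        pthread_mutex_unlock(&counter_lock);
    }
    return NULL;
}

/* Runs nthreads workers (at most 16) and returns the final counter value,
   or -1 if the arguments are out of range or a thread could not be created. */
long run_counter_threads(int nthreads, long iterations)
{
    pthread_t threads[16];
    int started = 0;

    if (nthreads < 1 || nthreads > 16)
        return -1;
    counter = 0;
    for (; started < nthreads; started++)
        if (pthread_create(&threads[started], NULL, worker, &iterations) != 0)
            break;
    for (int i = 0; i < started; i++)
        pthread_join(threads[i], NULL);
    return started == nthreads ? counter : -1;
}
```

With four workers and 100000 iterations each, the expectation under the
POSIX reading is a final count of 400000. A compiler that introduced
stores to counter outside the locked region would break that expectation.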


> 
> > We need a solution today for the combination of C/C++
> > and POSIX threads, we can't wait for the respective language
> > standardization committees to come up with a solution.
> 
>   You already have it, but you have declared it "unacceptable" and refused to
> use it without stating any clear reason.
> 
> cheers,
>   DaveK
> -- 
> Can't think of a witty .sigline today
> 

-- 

Erik Trulsson
[EMAIL PROTECTED]


Re: Optimization of conditional access to globals: thread-unsafe?

2007-10-28 Thread Richard Guenther
On 10/28/07, Erik Trulsson <[EMAIL PROTECTED]> wrote:
> On Sun, Oct 28, 2007 at 03:03:46PM -, Dave Korn wrote:
> > On 28 October 2007 13:32, Bart Van Assche wrote:
> >
> > >  Requiring that all
> > > thread-shared variables should be declared volatile is completely
> > > unacceptable.
> >
> >   Any variable that may be altered by an external unpredictable asynchronous
> > 'force majeure' must be declared volatile or the behaviour is undefined.  
> > Your
> > code is simply incorrect, and you appear to be demanding that the language
> > standards and the compiler all be revised to make the buggy code valid.
>
>
> Unfortunately it seems that the POSIX standard for threads say that as long
> as access to a shared variable is protected by a mutex there is no need to
> use 'volatile'.

Which is a very impractical thing to say, as it would essentially force the
compiler to assume every variable is protected by a mutex (how should it
prove otherwise?)

Richard.


Re: Optimization of conditional access to globals: thread-unsafe?

2007-10-28 Thread Erik Trulsson
On Sun, Oct 28, 2007 at 01:10:00PM -0400, Robert Dewar wrote:
> Erik Trulsson wrote:
> 
>> Unfortunately it seems that the POSIX standard for threads say that as 
>> long
>> as access to a shared variable is protected by a mutex there is no need to
>> use 'volatile'.
> 
> How does it say this, in some semantically precise way, or with hand
> waving as in this sentence.

I don't know.  I don't have access to the POSIX standard itself so I have
to depend on other people's descriptions of what POSIX says. (Thus my use of
'seems' above.)
Everything I have found seems to agree that POSIX does not require the use of
volatile though.


>> This means that POSIX essentially defines certain behaviours that the C
>> standard left undefined.
> 
> But does it do so precisely?

I doubt it.

Personally I suspect that "C + pthreads" is simply not well-defined
currently, and that almost every single program out there that uses pthreads
is depending on undefined behaviour.  Taking a hard-line stance on that
seems unlikely to be very popular or useful, though.



-- 

Erik Trulsson
[EMAIL PROTECTED]


Re: Optimization of conditional access to globals: thread-unsafe?

2007-10-28 Thread Andreas Schwab
Erik Trulsson <[EMAIL PROTECTED]> writes:

> I don't have access to the POSIX standard itself

See .

Andreas.

-- 
Andreas Schwab, SuSE Labs, [EMAIL PROTECTED]
SuSE Linux Products GmbH, Maxfeldstraße 5, 90409 Nürnberg, Germany
PGP key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."


Re: Optimization of conditional access to globals: thread-unsafe?

2007-10-28 Thread Erik Trulsson
On Sun, Oct 28, 2007 at 06:29:44PM +0100, Richard Guenther wrote:
> On 10/28/07, Erik Trulsson <[EMAIL PROTECTED]> wrote:
> > On Sun, Oct 28, 2007 at 03:03:46PM -, Dave Korn wrote:
> > > On 28 October 2007 13:32, Bart Van Assche wrote:
> > >
> > > >  Requiring that all
> > > > thread-shared variables should be declared volatile is completely
> > > > unacceptable.
> > >
> > >   Any variable that may be altered by an external unpredictable 
> > > asynchronous
> > > 'force majeure' must be declared volatile or the behaviour is undefined.  
> > > Your
> > > code is simply incorrect, and you appear to be demanding that the language
> > > standards and the compiler all be revised to make the buggy code valid.
> >
> >
> > Unfortunately it seems that the POSIX standard for threads say that as long
> > as access to a shared variable is protected by a mutex there is no need to
> > use 'volatile'.
> 
> Which is a very unpracticable say, as it essentially would force the compiler
> to assume every variable is protected by a mutex (how should it prove
> otherwise?)

Not quite, but nearly so.  There are some situations where the compiler can
prove that a variable cannot be shared - for example a variable which is
local to a function, where that function never passes the address of that
variable to any other function (and where that function itself does not
create any new threads).  In that case no other thread can know the address
of that variable and thus it cannot be shared.




-- 

Erik Trulsson
[EMAIL PROTECTED]


Re: Optimization of conditional access to globals: thread-unsafe?

2007-10-28 Thread Tomash Brechko
On Sun, Oct 28, 2007 at 17:51:57 +0100, Michael Matz wrote:
> I was merely showing that this transformation _does_ matter in some cases 
> to refute opposite claims which seemed to be expressed too airy in this 
> thread.

You got my intent all wrong.  Performance matters for both sides.  And
currently the only option for multithreaded programs is to use
volatile, which _greatly_ hurts performance.

What I was trying to say is that it would be nice to have a
-fno-thread-unsafe-optimization option.  And I was trying to say that
when you _enable_ this option, the performance won't be hurt much,
while the program will become thread-safe.  I never even said that
this option should be the default (though it would be reasonable for
-pthread or -fopenmp).  But there are obviously people who think
there's no need for such an option whatsoever, because "threaded code is
broken by definition, and I don't write it anyway".

Even if multithreading is of no immediate concern for you, it will
become one tomorrow when you decide to run your loop on all 1024 cores
your cell phone provides.  So you can't argue that this option
wouldn't be nice to have, can you?


And as I understood this discussion, there will be such an option in GCC
in the near future.


-- 
   Tomash Brechko


RE: Optimization of conditional access to globals: thread-unsafe?

2007-10-28 Thread Dave Korn
On 28 October 2007 17:39, Erik Trulsson wrote:

> On Sun, Oct 28, 2007 at 01:10:00PM -0400, Robert Dewar wrote:
>> Erik Trulsson wrote:
>> 
>>> Unfortunately it seems that the POSIX standard for threads say that as
>>> long as access to a shared variable is protected by a mutex there is no
>>> need to use 'volatile'.
>> 
>> How does it say this, in some semantically precise way, or with hand
>> waving as in this sentence.
> 
> I don't know.  I don't have access to the POSIX standard itself so I have
> to depend on other peoples description of what POSIX says. (Thus my use of
> 'seems' above.)
> Everything I have found seem to agree that POSIX does not require the use of
> volatile though.

  As far as I know, there is no separate 'pthreads' spec apart from what is
defined in the Threads section (2.9) of the SUS (http://tinyurl.com/2wdq2u)
and what it says about the various pthread_ functions in the system interfaces
(http://tinyurl.com/2r7c5k) chapter.  None of that, as far as I have been able
to determine, makes any kind of claims about access to shared state or the use
of volatile.


cheers,
  DaveK

-- 
Can't think of a witty .sigline today



Re: Optimization of conditional access to globals: thread-unsafe?

2007-10-28 Thread Tomash Brechko
On Sun, Oct 28, 2007 at 21:03:09 +0300, Tomash Brechko wrote:
> What I was trying to say, is that it would be nice to have
> -fno-thread-unsafe-optimization option.

Or rather the clearer -fno-speculative-store, in light of mprotect() and
non-writable memory.


-- 
   Tomash Brechko


Re: Infinite loop in compiling javax/swing/text/html/parser/HTML_401F.list

2007-10-28 Thread Eric Botcazou
> I'm working on it.

Thanks.  However, don't we simply void the benefit of memory partitioning by 
recursing on the MPTs?

-- 
Eric Botcazou


Re: Optimization of conditional access to globals: thread-unsafe?

2007-10-28 Thread Erik Trulsson
On Sun, Oct 28, 2007 at 06:06:17PM -, Dave Korn wrote:
> On 28 October 2007 17:39, Erik Trulsson wrote:
> 
> > On Sun, Oct 28, 2007 at 01:10:00PM -0400, Robert Dewar wrote:
> >> Erik Trulsson wrote:
> >> 
> >>> Unfortunately it seems that the POSIX standard for threads say that as
> >>> long as access to a shared variable is protected by a mutex there is no
> >>> need to use 'volatile'.
> >> 
> >> How does it say this, in some semantically precise way, or with hand
> >> waving as in this sentence.
> > 
> > I don't know.  I don't have access to the POSIX standard itself so I have
> > to depend on other peoples description of what POSIX says. (Thus my use of
> > 'seems' above.)
> > Everything I have found seem to agree that POSIX does not require the use of
> > volatile though.
> 
>   As far as I know, there is no separate 'pthreads' spec apart from what is
> defined in the Threads section (2.9) of the SUS (http://tinyurl.com/2wdq2u)
> and what it says about the various pthread_ functions in the system interfaces
> (http://tinyurl.com/2r7c5k) chapter.  None of that, as far as I have been able
> to determine, makes any kind of claims about access to shared state or the use
> of volatile.

Having just been pointed to that copy of the SUS, I must agree.  I can't
find anything in there saying anything at all about what is required to
safely share data between threads.  If that is really so it seems 'pthreads'
are even more under-specified than I thought (and I had fairly low
expectations in that regard.)
I really hope there is something I have missed.



-- 

Erik Trulsson
[EMAIL PROTECTED]


Re: Infinite loop in compiling javax/swing/text/html/parser/HTML_401F.list

2007-10-28 Thread Richard Guenther
On 10/28/07, Eric Botcazou <[EMAIL PROTECTED]> wrote:
> > I'm working on it.
>
> Thanks.  However, don't we simply void the benefit of memory partitioning by
> recursing on the MPTs?

Yes.  At least as far as compile time is concerned (we still have fewer VOPs).
The patch I just committed avoids some/most parts of the recursion.

The other option we have to fix the miscompile(s) is to put all SFTs of a
SFT parent var into the same partition always.  Which would essentially
either put all SFTs of a variable into a single MPT or none of the SFTs of
a variable into any MPT.

Let's see what the results are on the now "optimal" first strategy.

Richard.


Re: Infinite loop in compiling javax/swing/text/html/parser/HTML_401F.list

2007-10-28 Thread Gerald Pfeifer
On Sun, 28 Oct 2007, Richard Guenther wrote:
>> Don't be sorry, I can reproduce a compilation time surge for libjava on 
>> my machine (AMD64, 2.4 GHz, 1 GB).  In particular, HTML_401F.o now 
>> takes 40 min to compile for each version of the library.
>>
>> The surge comes from the fix for PR tree-optimization/33870 (rev 129675).
> I'm working on it.

This is a pretty tricky case -- HTML_401F.o already is the single reason 
why GCC needs close to 700 MB of virtual memory to bootstrap.  It would
be really good to see this reduced.  Tom gave it a try by splitting the
input early this year, but this mostly seems like an issue of the middle 
end.

Gerald


Re: Infinite loop in compiling javax/swing/text/html/parser/HTML_401F.list

2007-10-28 Thread Gerald Pfeifer
On Sun, 28 Oct 2007, David Daney wrote:
>>  On i386-freebsd the number I identified earlier this year was 700MB, 
>> 512MB definitely _not_ being sufficient.  I'd be very interested in 
>> your measurements, perhaps we can reduce the limit somewhat!
> I regularly bootstrap c,c++,java on a mips-linux-gnu system with 256MB
> of RAM (and 2000MB swap).

Here we go!  This means your system actually features 2256MB of memory, 
not just 256MB. ;-)  Sorry for not making it more clear that I was really
referring to overall memory, not just main memory.  According to my tests 
our peak in virtual memory use is a bit below 700MB while building libgcj.

Gerald


Re: Optimization of conditional access to globals: thread-unsafe?

2007-10-28 Thread skaller

On Sun, 2007-10-28 at 18:29 +0100, Richard Guenther wrote:
> On 10/28/07, Erik Trulsson <[EMAIL PROTECTED]> wrote:

> > Unfortunately it seems that the POSIX standard for threads says that as long
> > as access to a shared variable is protected by a mutex there is no need to
> > use 'volatile'.
> 
> Which is a very impractical thing to say, as it essentially would force the compiler
> to assume every variable is protected by a mutex (how should it prove
> otherwise?)

So the proof is easy: mutex ops are function calls,
assume all function calls lock or unlock.

Thus: store registers aliasing sharable variables into 
those variables on every function call.

int x = 1;
x = x + 1; // r0 <- x; r0++
x = x + 1; // r0++;
f();   // x <- r0; f();

Note: this is not well stated. There is no explicit coupling
between a given variable and a mutex.

If thread A locks Mutex MA, and B locks MB, there is no synchronisation
between these threads and sharing can fail: it has to be the same
mutex (to effect 'mutual exclusion').

When two threads are exclusive, it is safe to keep variables
in registers again (because the other thread is locked up).

OK .. hmm .. well this is the idea, but a more formal proof
would be cool.


-- 
John Skaller 
Felix, successor to C++: http://felix.sf.net


Re: Infinite loop in compiling javax/swing/text/html/parser/HTML_401F.list

2007-10-28 Thread Dominique Dhumieres
There is something I don't understand: why does the problem show up only
(mainly) in jc1?  If a similar increase had happened in gfortran, I (and
others) would have seen it.

Dominique


Re: Optimization of conditional access to globals: thread-unsafe?

2007-10-28 Thread skaller

On Sun, 2007-10-28 at 19:36 +0100, Erik Trulsson wrote:

> 
> Having just been pointed to that copy of the SUS, I must agree.  I can't
> find anything in there saying anything at all about what is required to
> safely share data between threads.  If that is really so it seems 'pthreads'
> are even more under-specified than I thought (and I had fairly low
> expectations in that regard.)
> I really hope there is something I have missed.

Clearly when two threads are both
running, one write to a variable means no other thread can
safely read or write it (assuming not atomic).

Mutex prevents more than one thread entering a particular
piece of code (that's defined, right?)

So the idea is clearly that once this is done, it is safe
to read or write a variable because no one else will.

Clearly the programmer must ensure no one else does.

Now, you are right Posix does not specify it is safe.

What you miss is that it doesn't have to: if it were
not safe, mutex would be useless, and since Posix specifies
Mutex it intends it to be useful.. so it follows that
it is safe .. no volatile required... :)

Remember Posix is an ISO Standard which codifies existing
practice and everyone makes the above assumptions.

-- 
John Skaller 
Felix, successor to C++: http://felix.sf.net


Re: Optimization of conditional access to globals: thread-unsafe?

2007-10-28 Thread Bart Van Assche
On 10/28/07, Robert Dewar <[EMAIL PROTECTED]> wrote:
> Bart Van Assche wrote:
>
> > My opinion is that, given the importance of multithreading, it should
> > be documented in the gcc manual which optimizations can cause trouble
> > in multithreaded software (such as (3) and (4)). It should also be
> > documented which compiler flags must be used to disable optimizations
> > that cause trouble for multithreaded software. Requiring that all
> > thread-shared variables should be declared volatile is completely
> > unacceptable.
>
> Why is this unacceptable .. seems much better to me than writing
> undefined stuff.

Requiring that all thread-shared variables must be declared volatile,
even those protected by calls to synchronization functions, implies
that all multithreaded code and all existing libraries have to be
changed. Functions like snprintf() write data to a buffer provided by
the caller. If all thread-shared variables should be declared
volatile, then a second version of snprintf() would be required with
volatile char* as the datatype of the first argument instead of char*.
Quite a hefty requirement if you ask me ...
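
To make the snprintf() point concrete, here is a minimal sketch. The names
shared_buf, buf_lock and format_shared are made up for illustration and do
not come from any real code base. The buffer is shared between threads and
protected by a mutex, but it is not declared volatile, so it can be handed
to snprintf() as-is. If it were a volatile char array, the call would need a
cast that discards the qualifier:

```c
#include <assert.h>
#include <pthread.h>
#include <stdio.h>
#include <string.h>

static pthread_mutex_t buf_lock = PTHREAD_MUTEX_INITIALIZER;
static char shared_buf[64];     /* shared between threads, NOT volatile */

/* Formats value into the shared buffer while holding the lock.
   Returns the number of characters snprintf() reports. */
int format_shared(int value)
{
    int n;

    pthread_mutex_lock(&buf_lock);
    /* snprintf() takes a plain char *; a volatile char buffer would
       not convert to that type without a cast. */
    n = snprintf(shared_buf, sizeof shared_buf, "value=%d", value);
    pthread_mutex_unlock(&buf_lock);

    assert(n >= 0);
    return n;
}
```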

Bart Van Assche.


GNU GCC -m32 Problem?

2007-10-28 Thread Joseph North
Dear Mr. Randi J. Rost, et al.:


   I have a copy of your book entitled, OpenGL Shading Language,
Second  Edition, T 385 R665 2006 MAIN, UT Austin, and learned
that you have also worked with Motif.
   I have had a problem getting my "long double" version of
XEphem 3.7.2 to build under Red Hat Linux Fedora 7, x86_64
on an AMD Athlon 64 X2 Dual-Core Processor 5600+ based
PC.
   It builds OK under Red Hat Linux Fedora Core 6 on a Celeron
based 32-bit computer.  I could then bring the 32-bit application
executable to the 64-bit PC, and it worked fine also.
   However, I could not get it to build (using make) on the 64-bit PC,
even when I had specified ~ gcc -m32 . . ., using the latest GNU
GCC Compiler Collection (i.e., gcc-4.1.2-33.x86_64.rpm, et caetera,
installed).  Also, when I tried a simple C program - NO Motif at all
involved - I got compile error messages related to ~ stdlib.h, ~ 32 bit
stub missing, I seem to recall, for, I'm not at home now.
   I obviously would like to be able to also build on my 64-bit PC,
as well as develop and run in that environment entirely.
   Any assistance would be most gratefully appreciated.
   Tempus fugit et ad augusta per angusta.


  Nil desparare (Gauss),

  Joseph Roy D. North
  LeRoi  Du Nord
  3220 Duval Road, Apt. 1110
  Austin, TX 78759-3524, USA


Scientia est Potentia!

I Prefer Pi (a palindrome)!


Re: Optimization of conditional access to globals: thread-unsafe?

2007-10-28 Thread Richard Guenther
On 10/28/07, skaller <[EMAIL PROTECTED]> wrote:
>
> On Sun, 2007-10-28 at 18:29 +0100, Richard Guenther wrote:
> > On 10/28/07, Erik Trulsson <[EMAIL PROTECTED]> wrote:
>
> > > Unfortunately it seems that the POSIX standard for threads says that as
> > > long as access to a shared variable is protected by a mutex there is no
> > > need to use 'volatile'.
> >
> > Which is a very impractical thing to say, as it essentially would force the
> > compiler to assume every variable is protected by a mutex (how should it
> > prove otherwise?)
>
> So the proof is easy: mutex ops are function calls,
> assume all function calls lock or unlock.
>
> Thus: store registers aliasing sharable variables into
> those variables on every function call.
>
> int x = 1;
> x = x + 1; // r0 <- x; r0++
> x = x + 1; // r0++;
> f();   // x <- r0; f();
>
> Note: this is not well stated. There is no explicit coupling
> between a given variable and a mutex.
>
> If thread A locks Mutex MA, and B locks MB, there is no synchronisation
> between these threads and sharing can fail: it has to be the same
> mutex (to effect 'mutual exclusion').
>
> When two threads are exclusive, it is safe to keep variables
> in registers again (because the other thread is locked up).
>
> OK .. hmm .. well this is the idea, but a more formal proof
> would be cool.

Doesn't work:

#include <stdbool.h>
#include <pthread.h>

static pthread_mutex_t mx = PTHREAD_MUTEX_INITIALIZER;
int a;

void foo(bool locked)
{
  if (locked)
    a++;
}

void bar(void)
{
  pthread_mutex_lock (&mx);
  foo (true);
  pthread_mutex_unlock (&mx);
}

you cannot do such analysis without seeing the whole program.

Richard.


Re: GNU GCC -m32 Problem?

2007-10-28 Thread Joseph North
On 10/28/07, Kai Ruottu <[EMAIL PROTECTED]> wrote:
> Joseph North wrote:
> >I have had a problem getting my "long double" version of
> > XEphem 3.7.2 to build under Red Hat Linux Fedora 7, x86_64
> > on an AMD Athlon 64 X2 Dual-Core Processor 5600+ based
> > PC.
> >It builds OK under Red Hat Linux Fedora Core 6 on a Celeron
> > based 32-bit computer.  I could then bring the 32-bit application
> > executable to the 64-bit PC, and it worked fine also.
> >However, I could not get it to build (using make) on the 64-bit PC,
> > even when I had specified ~ gcc -m32 . . ., using the latest GNU
> > GCC Compiler Collection (i.e., gcc-4.1.2-33.x86_64.rpm, et caetera,
> > installed).  Also, when I tried a simple C program - NO Motif at all
> > involved - I got compile error messages related to ~ stdlib.h, ~ 32 bit
> > stub missing, I seem to recall, for, I'm not at home now.
> >
> The 32-bit development libraries maybe are not installed... But at least
> that 'stdlib.h' should be common for both the default 64-bit and the
> optional 32-bit libraries, so maybe also the 64-bit libraries were not
> installed...
>
> I myself have the 32-bit Fedora7 and remember that those "C/C++
> development" things, wanted or not, were asked when installing...  For
> bare "kernel compiling" those C libraries are not needed.
>
>



Dear Kai:


   Thank you!
   I'll try to check into the details when I get back home, and let you know
more.
   I thought the entire GCC suite (x86_64 version) installed OK.  I believe
I just need that suite for "-m32" to work OK.
   Tempus fugit et ad augusta per angusta.


  Nil desparare (Gauss),

  Joseph Roy D. North
  LeRoi  Du Nord
  3220 Duval Road, Apt. 1110
  Austin, TX 78759-3524, USA


Scientia est Potentia!

I Prefer Pi (a palindrome)!


Re: Optimization of conditional access to globals: thread-unsafe?

2007-10-28 Thread David Miller
From: Michael Matz <[EMAIL PROTECTED]>
Date: Sun, 28 Oct 2007 18:08:23 +0100 (CET)

> I mean who am I to demand that people write correct code, 
> I must be insane.

Correctness is defined by pervasive common usage as much as it
is by paper standards.


Re: Optimization of conditional access to globals: thread-unsafe?

2007-10-28 Thread David Miller
From: Michael Matz <[EMAIL PROTECTED]>
Date: Sun, 28 Oct 2007 18:08:23 +0100 (CET)

> 
> You mean like POSIX doesn't count very much for the kernel behaviour?
> 

Nice scarecrow.

Linux has and will break POSIX where POSIX asks unreasonable and
stupid things.

And in particular we will not follow POSIX if doing so breaks
pervasive practices in userspace that have worked under Linux for a
long time.

We do not follow paper standards blindly.  Practical considerations
always trump standards.  Standards are often wrong or their authors did
not consider a particular case sufficiently.


RE: Optimization of conditional access to globals: thread-unsafe?

2007-10-28 Thread Dave Korn
On 29 October 2007 01:01, David Miller wrote:

> From: Michael Matz <[EMAIL PROTECTED]>
> Date: Sun, 28 Oct 2007 18:08:23 +0100 (CET)
> 
>> 
>> You mean like POSIX doesn't count very much for the kernel behaviour?
>> 
> 
> Nice scarecrow.
> 
> Linux has and will break POSIX where POSIX asks unreasonable and
> stupid things.
> 
> And in particular we will not follow POSIX if doing so breaks
> pervasive practices in userspace that have worked under Linux for a
> long time.
> 
> We do not follow paper standards blindly.  Practical considerations
> always trump standards.  Standards are often wrong or their authors did
> not consider a particular case sufficiently.


  "My way is right and everyone else's is wrong".

  Better write your own compiler then.


cheers,
  DaveK
-- 
Can't think of a witty .sigline today



RE: Optimization of conditional access to globals: thread-unsafe?

2007-10-28 Thread Dave Korn
On 28 October 2007 18:03, Tomash Brechko wrote:

> You got my intent all wrong.  Performance matters for both sides.  And
> currently the only option for multithreaded programs is to use
> volatile, which _greatly_ hurts performance.
> 
> What I was trying to say, is that it would be nice to have
> -fno-thread-unsafe-optimization option.  And I was trying to say that
> when you _enable_ this option, the performance won't be hurt much,
> while the program will become thread-safe.  

  Thing is, if you disable all optimisations that are potentially unsafe in
the presence of threads, won't you just get the same effect as if you had used
volatile anyway, only on every single variable in the program instead of just
the ones the programmer has designated as sensitive?

> I never even said that
> this option should be the default (though it would be reasonable for
> -pthread or -fopenmp).  

  Well, as I said before, I'm not going to complain about some optional flag
that limits the compiler's optimisations, but I think that what's going to
happen is that you're going to find another race condition caused by another
optimiser, and want to add disabling that optimisation as well to this new
flag, and there's going to be a long process of repeated cycles of this, until
what you end up with is a flag that has the exact same effect as volatile.

> But there are obviously people who think
> there's no need in such option whatsoever, because "threaded code is
> broken by definition, and I don't write it anyway".
> 
> Even if multithreading is of no immediate concern for you, it will
> become one tomorrow when you decide to run your loop on all 1024 cores
> your cell phone provides.  

  This is smug patronising nonsense.  There are a lot of people around here
who've been writing multithreaded code for decades.  Not to mention
multi-processor code.  In both symmetric and asymmetric setups.  With and
without all kinds of coherent and non-coherent caching considerations to take
into account.  Please try not to claim that you are some kind of far sighted
visionary preaching the future to a bunch of ignorami; we've all been doing it
for years, and we do have a pretty good grasp of the concepts.

> So you can't argue that this option
> wouldn't be nice to have, no?

  I haven't needed it yet.  I mark my volatile variables as volatile, rather
than expecting the compiler to treat all variables the same indiscriminately.
You're right that volatile doesn't solve the whole problem; you still have to
write correctly threadsafe code, you still have to use locks, and you still
have to be aware of the huge complexities of lock-free code, but what it does
give you is the essential tool you need from the compiler for that: a strict
one-to-one relationship between the loads and stores in your high level source
and those in the emitted assembler.
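
  As a rough illustration of that one-to-one relationship (the function names
are invented for this sketch): in the first function every iteration has to
perform its own load through p, while in the second the compiler is free to
load once and reuse the value.

```c
/* Reads *p exactly n times: each access to a volatile object in the
   source must appear as an access in the generated code. */
int sum_reads(volatile int *p, int n)
{
    int total = 0;
    for (int i = 0; i < n; i++)
        total += *p;
    return total;
}

/* Same loop on a plain int: the compiler may hoist the load out of the
   loop, so the number of loads actually emitted is unspecified. */
int sum_reads_plain(int *p, int n)
{
    int total = 0;
    for (int i = 0; i < n; i++)
        total += *p;
    return total;
}
```

  Both return the same value when nothing else touches the variable; they
differ only in how many loads the compiler is obliged to emit.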

> And as I understood this discussion, there will be such option in GCC
> in the nearest future.

  Well, as I don't mind confirming once more, if you really want
-fshoot-yourself-in-the-foot, you're welcome to it.

cheers,
  DaveK
-- 
Can't think of a witty .sigline today



Re: Infinite loop in compiling javax/swing/text/html/parser/HTML_401F.list

2007-10-28 Thread David Daney
Gerald Pfeifer wrote:
> On Sun, 28 Oct 2007, David Daney wrote:
>   
>>>  On i386-freebsd the number I identified earlier this year was 700MB, 
>>> 512MB definitely _not_ being sufficient.  I'd be very interested in 
>>> your measurements, perhaps we can reduce the limit somewhat!
>>>   
>> I regularly bootstrap c,c++,java on a mips-linux-gnu system with 256MB
>> of RAM (and 2000MB swap).
>> 
>
> Here we go!  This means your system actually features 2256MB of memory, 
> not just 256MB. ;-)  Sorry for not making it more clear that I was really
> referring to overall memory, not just main memory.  According to my tests 
> our peak in virtual memory use is a bit below 700MB while building libgcj.
>
>   
I don't dispute that this is true.  However since most modern systems 
are able to swap to disk, it is often more useful to know the minimum 
amount of RAM required as it can be much more difficult to add RAM to a 
system than increase the size of its swap device.

David Daney



Re: Optimization of conditional access to globals: thread-unsafe?

2007-10-28 Thread David Miller
From: "Dave Korn" <[EMAIL PROTECTED]>
Date: Mon, 29 Oct 2007 01:05:06 -

>   "My way is right and everyone else's is wrong".

I didn't say that.  I said that what users do on a broad scale is an
important consideration that often trumps paper standards.  And yes,
users as well as the implementors themselves do in fact get to be a
part of making that determination.

Standards are also not infallible laws that should be followed
blindly.

More importantly, you cannot break things on people out of mere
convenience.

The paper standards don't matter if that's not what people actually
do.  Nobody marks all of their thread and signal accessed shared
variables as volatile, and telling them to do so does not solve the
problem.  Rather, it just infuriates those users.

Find me one OS kernel code base written in the C language that marks
all lock protected variables as volatile?  And no you cannot cop out
from this obvious example merely by saying that none of them are truly
written in the "C language."

Again, standards should be strongly questioned when they do not
acknowledge and co-exist with wide spread existing practice.

>   Better write your own compiler then.

If this becomes the common attitude of GCC developers, you can pretty
much guarantee this will drive people to work on LLVM and other
alternative compiler code bases.


Re: Optimization of conditional access to globals: thread-unsafe?

2007-10-28 Thread David Miller
From: "Dave Korn" <[EMAIL PROTECTED]>
Date: Mon, 29 Oct 2007 01:16:07 -

>   Thing is, if you disable all optimisations that are potentially
> unsafe in the presence of threads, won't you just get the same
> effect as if you had used volatile anyway, only on every single
> variable in the program instead of just the ones the programmer has
> designated as sensitive?

This is not really what is being suggested at all.

The compiler simply cannot speculatively load or store to variables
with global visibility.

Suggesting volatile is totally impractical and in fact overkill.

Even basic correct single-threaded UNIX programs are broken by these
speculative stores.  If I use a conditional test to protect access to
memory mmap()'d with a read-only attribute, GCC's optimization will
cause write-protection exceptions.
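
A minimal sketch of the kind of code I mean, assuming a POSIX system that
provides MAP_ANONYMOUS; map_page and bump_if_writable are names invented for
this example. The source never stores to the page unless it is told the page
is writable, so a compiler that turns the conditional store into an
unconditional load/store pair would fault on the read-only mapping:

```c
#define _DEFAULT_SOURCE         /* for MAP_ANONYMOUS on glibc */
#include <stddef.h>
#include <sys/mman.h>

/* Stores through p only when the caller says the page is writable. */
void bump_if_writable(int *p, int writable)
{
    if (writable)
        (*p)++;
}

/* Maps one anonymous, zero-filled region with the given protection.
   The 4096-byte length is an assumption of this sketch.
   Returns NULL on failure. */
int *map_page(int prot)
{
    void *m = mmap(NULL, 4096, prot, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return m == MAP_FAILED ? NULL : (int *)m;
}
```

Calling bump_if_writable(map_page(PROT_READ), 0) must not touch the page at
all if the program is compiled as written.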


Re: Optimization of conditional access to globals: thread-unsafe?

2007-10-28 Thread Mark Mielke

David Miller wrote:
> From: Michael Matz <[EMAIL PROTECTED]>
> Date: Sun, 28 Oct 2007 18:08:23 +0100 (CET)
>
> > I mean who am I to demand that people write correct code,
> > I must be insane.
>
> Correctness is defined by pervasive common usage as much as it
> is by paper standards.

Reading this thread, I find myself confused. GCC is used regularly for 
both multi-threaded and single-threaded code. It is impractical to 
require all variables that may be shared between threads to be declared 
volatile. Worse, I find myself suspecting it may be impossible. Any 
particular library may be used from a multi-threaded context or a 
single-threaded context, with a very common belief that the access can 
be protected by wrapping all accesses to the thread-unsafe resource with 
a mutex. Are some people here really suggesting that all variables 
everywhere be declared volatile?


I remain unconvinced that declaring these shared variables "volatile" is 
correct. Certainly, if the ordering of reads and writes must be 
carefully controlled completely by the programmer, volatile should be 
used. Most uses are not like this. Most uses require only loose ordering. 
The loose ordering is provided by a mutex or other synchronization 
primitive. As any function call might call a synchronization primitive, 
this would mean that all reads or writes to shared data that are scheduled 
before a function call must be performed before the function is called. 
Similarly, all such data may have changed by the time the function returns. 
Unless the function can be proven to have no effect (global optimization 
analysis? function inlining?), this is expected behavior.


Am I stating the obvious? Is this an unreasonable expectation for some 
reason? Do I not understand the issue?


Cheers,
mark

--
Mark Mielke <[EMAIL PROTECTED]>


Re: Optimization of conditional access to globals: thread-unsafe?

2007-10-28 Thread skaller

On Sun, 2007-10-28 at 22:41 +0100, Richard Guenther wrote:

> > OK .. hmm .. well this is the idea, but a more formal proof
> > would be cool.
> 
> Doesn't work:

Of course it works.

> you cannot do such analysis without seeing the whole program.

There's no need. A mutex is assumed at each function call.
That is, registers are dumped to variables at each

* call
* function entry
* function return

This means you cannot merely, say, push caller save
registers when calling a function, and you cannot leave
values in callee save registers, if the variable aliased
is sharable. 

In your example:

int a;
void foo(bool locked)
{
  if (locked)
    a++;
}

I see no problem, a is in memory, you can safely do

if(!locked) goto end;
r0 <- a; r0++; a <- r0;
end: return;

Since 'a' here is sharable, the function can assume it
is not aliased in a register, load and increment it
and store it back.

It doesn't matter then, whether there is a mutex or not.
In fact, it doesn't matter if locked is true or false.

I also can't see anything at all is lost here.


-- 
John Skaller 
Felix, successor to C++: http://felix.sf.net


RE: Optimization of conditional access to globals: thread-unsafe?

2007-10-28 Thread Dave Korn
On 29 October 2007 01:38, David Miller wrote:

> From: "Dave Korn" <[EMAIL PROTECTED]>
> Date: Mon, 29 Oct 2007 01:16:07 -
> 
>>   Thing is, if you disable all optimisations that are potentially
>> unsafe in the presence of threads, won't you just get the same
>> effect as if you had used volatile anyway, only on every single
>> variable in the program instead of just the ones the programmer has
>> designated as sensitive?
> 
> This is not really what is being suggested at all.
> 
> The compiler simply cannot speculatively load or store to variables
> with global visibility.

  You'll be back.  Next week, you'll discover a corner case where caching a
shared variable in a register can be a bad thing when one thread uses locks
and the other doesn't, and you'll be back to demand that optimisation is
removed as well.

  BTW, you and Tomash should get your stories in synch.  He says speculative
loads are ok, just no stores, and wants a kind of half-volatile flag that
would only suppress stores.  I think you're already looking one step further
down the road than he is and have realised that speculative loads will give
you problems too.
 
> Suggesting volatile is totally impractical and in fact overkill.

  I keep hearing this claim, but nobody says why.  What /else/ does it do that
isn't necessary for correctness in this (or other) case(s)?

> Even basic correct single-threaded UNIX programs are broken by these
> speculative stores.  If I use a conditional test to protect access to
> memory mmap()'d with a read-only attribute, GCC's optimization will
> cause write-protection exceptions.

  Hmm, that's a far more substantial argument.  It raises the question: is the
compiler entitled to assume that a non-const pointer always points to
non-const data?

cheers,
  DaveK
-- 
Can't think of a witty .sigline today



Re: Optimization of conditional access to globals: thread-unsafe?

2007-10-28 Thread David Miller
From: "Dave Korn" <[EMAIL PROTECTED]>
Date: Mon, 29 Oct 2007 02:39:15 -

>   BTW, you and Tomash should get your stories in synch.  He says
> speculative loads are ok, just no stores, and wants a kind of
> half-volatile flag that would only suppress stores.  I think you're
> already looking one step further down the road than he is and have
> realised that speculative loads will give you problems too.

Probably speculative loads are OK, as long as function calls
to functions the compiler cannot see the complete implementation
of form an implicit boundary (ie. any memory might be modified)
which it happily does already.

In what cases those speculative loads are profitable is another
matter, given how expensive cache misses are compared to mispredicted
branches.


Re: Optimization of conditional access to globals: thread-unsafe?

2007-10-28 Thread David Miller
From: "Dave Korn" <[EMAIL PROTECTED]>
Date: Mon, 29 Oct 2007 02:39:15 -

> On 29 October 2007 01:38, David Miller wrote:
> 
> > Even basic correct single-threaded UNIX programs are broken by these
> > speculative stores.  If I use a conditional test to protect access to
> > memory mmap()'d with a read-only attribute, GCC's optimization will
> > cause write-protection exceptions.
> 
>   Hmm, that's a far more substantial argument.  It raises the question: is the
> compiler entitled to assume that a non-const pointer always points to
> non-const data?

Using mprotect() to mark pages of garbage collection memory read-only
in the compiler in order to speed up GC sweeps done during compilation
has been suggested at times in the past.  The idea is that pages
marked read-only are elided from the GC scan lists (their state
remains the same if nobody writes to them) and to trap write access
exceptions via a signal handler, which puts back the write capability
for that page, and adds the page to the GC scan lists before returning
from the signal handler.

If GCC ever used this kind of technique, we can then proclaim with joy
that even GCC is not a properly written C program!

To me it's pretty clear that speculative stores have to be done with
extreme care, if at all.  Right now we know of many real life every
day examples that break because of them: threaded programs, OS
kernels, programs using signal handlers, and anything using
mprotect() in sophisticated ways such as garbage collectors.


Re: Optimization of conditional access to globals: thread-unsafe?

2007-10-28 Thread skaller

On Sun, 2007-10-28 at 21:43 +0100, Bart Van Assche wrote:
> On 10/28/07, Robert Dewar <[EMAIL PROTECTED]> wrote:
> > Bart Van Assche wrote:
> >
> > > My opinion is that, given the importance of multithreading, it should
> > > be documented in the gcc manual which optimizations can cause trouble
> > > in multithreaded software (such as (3) and (4)). It should also be
> > > documented which compiler flags must be used to disable optimizations
> > > that cause trouble for multithreaded software. Requiring that all
> > > thread-shared variables should be declared volatile is completely
> > > unacceptable.
> >
> > Why is this unacceptable .. seems much better to me than writing
> > undefined stuff.
> 
> Requiring that all thread-shared variables must be declared volatile,
> even those protected by calls to synchronization functions, implies
> that all multithreaded code and all existing libraries have to be
> changed.

Yes, of course that is out of the question. Instead all shared
variables are treated as sharable.

This is NOT the same as volatile. Sharable variables need to
be 'de-registered' (dumped out of aliasing registers) at
function call boundaries. This is MUCH less strict than
volatile, which is at every sequence point.

If the function call is to a visible function, gcc can look in it
to see if it might fiddle a mutex, and if not, there's no need
to dump the registers. In particular, there's no synchronisation
point when a function is inlined.

IMHO the effect of this is to change the optimiser so that local
variables not addressed are preferred for lifting to registers
over globals or addressed locals.

C++ non-static members must be treated as sharable, even
if they're private, however this is not necessarily a big
deal if they're inline.

The effect seems to be that this:

if(cond)x++;

can quite safely be replaced by

r0 <- x;
if(cond) r0++;
x <- r0;

which is the topic of this discussion.

If this code mutually excludes other accesses to x, then it is
safe, and if it doesn't, then the programmer is responsible
for writing undefined behaviour, not the compiler.
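
Written out as C rather than register transfers, the two forms look roughly
like this. It is a sketch only: inc_original and inc_transformed are names
invented here, and the second function is a source-level rendering of what
the optimiser would emit, not something a programmer would write.

```c
int x;

/* As written by the programmer: x is stored only when cond is true. */
void inc_original(int cond)
{
    if (cond)
        x++;
}

/* Rendering of the transformed code: x is loaded and stored
   unconditionally, and only the increment depends on cond. */
void inc_transformed(int cond)
{
    int r0 = x;     /* r0 <- x      */
    if (cond)
        r0++;       /* r0++         */
    x = r0;         /* x <- r0      */
}
```

With a single thread the two are indistinguishable; the difference is only
that the second writes x even when cond is false.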

-- 
John Skaller 
Felix, successor to C++: http://felix.sf.net


Re: Optimization of conditional access to globals: thread-unsafe?

2007-10-28 Thread skaller

On Sun, 2007-10-28 at 18:34 -0700, David Miller wrote:

> More importantly, you cannot break things on people out of mere
> convenience.

If you want a case of this .. it's the ill-considered strict aliasing
rules in C. WG14 seems to think C had a strong enough type system
to make this rule, but it does not. So gcc provides a switch
to turn it off.

This is actually a bit annoying, because the granularity is not
so sweet: I'd be happy with floating point aliasing to be strict,
but not integers and *definitely* not pointers.

C is too brain dead for strict aliasing: it could break many
memory management codes which, for example, alias memory
with pointer to void* for alignment purposes, or intptr_t 
for bit fiddling. Or code like this which I write:

struct X { int x; } x;
struct Y { int y[1]; } y;
Y *py  = (Y*)(void*)&x;
X *px = (X*)(void*)&y;

[My Felix compiler does this cast systematically and deliberately]
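
For comparison, a sketch of one way to reinterpret an object's bytes that
does not depend on the aliasing rules at all: copying through memcpy. This
is not what Felix generates. float_bits is a name invented here, and the
sketch assumes float is a 32-bit IEEE 754 type.

```c
#include <stdint.h>
#include <string.h>

/* Returns the object representation of f as a 32-bit integer. */
uint32_t float_bits(float f)
{
    uint32_t u;

    /* Byte-wise copy: no pointer of the "wrong" type is ever
       dereferenced, so strict aliasing does not come into it. */
    memcpy(&u, &f, sizeof u);
    return u;
}
```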

-- 
John Skaller 
Felix, successor to C++: http://felix.sf.net


Re: Optimization of conditional access to globals: thread-unsafe?

2007-10-28 Thread skaller

On Sun, 2007-10-28 at 18:37 -0700, David Miller wrote:
> From: "Dave Korn" <[EMAIL PROTECTED]>
> Date: Mon, 29 Oct 2007 01:16:07 -

> The compiler simply cannot speculatively load or store to variables
> with global visibility.

I think it can.

> Suggesting volatile is totally impractical and in fact overkill.
> 
> Even basic correct single-threaded UNIX programs are broken by these
> speculative stores.  If I use a conditional test to protect access to
> memory mmap()'d with a read-only attribute, GCC's optimization will
> cause write-protection exceptions.

That is the programmer's fault, they should have accessed the 
variable using a const. Failing to do so gives the compiler
permission to write speculatively.

-- 
John Skaller 
Felix, successor to C++: http://felix.sf.net


Re: Optimization of conditional access to globals: thread-unsafe?

2007-10-28 Thread David Miller
From: skaller <[EMAIL PROTECTED]>
Date: Mon, 29 Oct 2007 14:21:10 +1100

> 
> On Sun, 2007-10-28 at 18:37 -0700, David Miller wrote:
> > Even basic correct single-threaded UNIX programs are broken by these
> > speculative stores.  If I use a conditional test to protect access to
> > memory mmap()'d with a read-only attribute, GCC's optimization will
> > cause write-protection exceptions.
> 
> That is the programmer's fault, they should have accessed the 
> variable using a const. Failing to do so gives the compiler
> permission to write speculatively.

I do not agree with you.

It is perfectly legal to use read-only protection to implement
things like efficient garbage collection scans.

It's not even write exceptions, what about the pointer being
valid at all?

Memory accesses really are special.  You can only execute them when
the program would have allowed them to occur, otherwise you risk
taking exceptions.

Do you really think that:

    the_pointer_is_valid = func(potentially_bad_pointer);
    if (the_pointer_is_valid)
        (*potentially_bad_pointer)++;

should generate any memory accesses when 'the_pointer_is_valid'
evaluates to false?

And yet this is just another form of our original "threading" example:

    if (pthread_mutex_trylock(lock) == 0)
        (*counter)++;

It shows that memory accesses are a fundamental issue.

Only if you can prove that the program would access said memory with
said kind of access (read or write) can you legally speculate.

Happily it seems that for the cases where it helps code generation
substantially, this precondition is true.
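
For illustration, the transformation being argued about can be sketched at the source level. This is a hand-written approximation, not actual GCC output, and the names increment_if and increment_speculative are made up for the example. The first function touches *counter only when the condition holds; the second is roughly what a speculated load/store pair amounts to.

```c
/* As written by the programmer: no memory access unless cond is true. */
void increment_if(int *counter, int cond)
{
    if (cond)
        (*counter)++;
}

/* Hand-written approximation of the speculated form (hypothetical, not
 * compiler output): the load and the store are unconditional, so
 * *counter must be readable and writable even when cond is false. */
void increment_speculative(int *counter, int cond)
{
    int tmp = *counter;      /* unconditional load */
    if (cond)
        tmp++;
    *counter = tmp;          /* unconditional store */
}
```

With a valid, writable, unshared counter the two functions give the same result. They differ only when the false branch was protecting something: a read-only mapping, an invalid pointer, or a lock that was not acquired.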


Re: Optimization of conditional access to globals: thread-unsafe?

2007-10-28 Thread Mark Mielke

Dave Korn wrote:
> On 29 October 2007 01:38, David Miller wrote:
>
>   You'll be back.  Next week, you'll discover a corner case where caching a
> shared variable in a register can be a bad thing when one thread uses locks
> and the other doesn't, and you'll be back to demand that optimisation is
> removed as well.

Why would David ask for something so unreasonable?

Why do you believe that the use of mutex to synchronize access to a 
shared resource, without the use of volatile on the shared resource 
being accessed, as documented in countless real life examples, is 
unreasonable or incorrect?


I do not understand your position.

Cheers,
mark

--
Mark Mielke <[EMAIL PROTECTED]>


Re: Optimization of conditional access to globals: thread-unsafe?

2007-10-28 Thread skaller

On Sun, 2007-10-28 at 20:32 -0700, David Miller wrote:
> From: skaller <[EMAIL PROTECTED]>

> > That is the programmer's fault; they should have accessed the
> > variable using a const. Failing to do so gives the compiler
> > permission to write speculatively.
> 
> I do not agree with you.

Yeah, on consideration you're probably right.

> It is perfectly legal to use read-only protection to implement
> things like efficient garbage collection scans.

Yes. And I'm wrong about 'const': my way, you'd have to make it
const and cast to non-const to prevent speculative writes.
That's unworkable.

> It's not even write exceptions, what about the pointer being
> valid at all?

That's a different case. 

GCC already provides a way to do this on, say, AMD64,
using __builtin_prefetch. The prefetch instruction is perfectly
legal on an invalid address. Yeah, I know this isn't a
complete load (the actual load into a register has to
be done as well).
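
A minimal sketch of that idea, assuming GCC or a compatible compiler. The function name load_if_valid is made up for the example. The GCC manual states that a data prefetch does not fault on an invalid address, as long as the address expression itself is valid.

```c
/* Reads *p only when the caller says the pointer is good.
 * The prefetch is issued unconditionally: it is only a hint and,
 * per the GCC manual, does not fault if p is invalid. */
int load_if_valid(const int *p, int is_valid)
{
    __builtin_prefetch(p, 0 /* prefetch for read */, 3 /* high temporal locality */);
    if (is_valid)
        return *p;           /* the real load stays behind the test */
    return 0;
}
```

The prefetch can hide latency, but it does not replace the guarded load, so the question of whether the compiler may hoist that load is still open.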

-- 
John Skaller 
Felix, successor to C++: http://felix.sf.net


Re: Optimization of conditional access to globals: thread-unsafe?

2007-10-28 Thread Darryl Miles

David Miller wrote:
> The compiler simply cannot speculatively load or store to variables
> with global visibility.


s/with global visibility/with visibility outside the scope of the 
functional unit the compiler is able to see at compile time/


Which basically means the compiler is king for doing these tricks with 
CPU registers, areas of the stack and inlined functional units in which 
it can be 100% sure about its access to this data.



What are the issues with "speculative loads"?  Is there such a thing as 
a write-only page on any system GCC targets?  For general usage 
the x86 concept of read-only or read-write fits well, which means that 
speculative loads are usually a safe optimization.


But I'd be all for a way to allow/disallow each optimization 
independently (this gives the developer more choice in the matter), with 
"speculative loads" enabled by default and "speculative stores" disabled 
by default for any multi-threaded code.


As per my other posting, I'd like the ability to use 
__attribute__((disallow_speculative_load,disallow_speculative_store)) or 
__attribute__((allow_speculative_load,allow_speculative_store)) to 
pin the issue, with -fdisallow-speculative-load, 
-fallow-speculative-load, etc. setting the defaults for the entire file 
being compiled.



Darryl


Re: Optimization of conditional access to globals: thread-unsafe?

2007-10-28 Thread Ross Smith

Erik Trulsson wrote:
> On Sun, Oct 28, 2007 at 06:06:17PM -, Dave Korn wrote:
> >   As far as I know, there is no separate 'pthreads' spec apart from what is
> > defined in the Threads section (2.9) of the SUS (http://tinyurl.com/2wdq2u)
> > and what it says about the various pthread_ functions in the system interfaces
> > (http://tinyurl.com/2r7c5k) chapter.  None of that, as far as I have been able
> > to determine, makes any kind of claims about access to shared state or the use
> > of volatile.
>
> Having just been pointed to that copy of the SUS, I must agree.  I can't
> find anything in there saying anything at all about what is required to
> safely share data between threads.  If that is really so it seems 'pthreads'
> are even more under-specified than I thought (and I had fairly low
> expectations in that regard.)
> I really hope there is something I have missed.


I think the relevant part is here:
http://www.opengroup.org/onlinepubs/009695399/basedefs/xbd_chap04.html#tag_04_10

[begin quote]

4.10 Memory Synchronization

Applications shall ensure that access to any memory location by more 
than one thread of control (threads or processes) is restricted such 
that no thread of control can read or modify a memory location while 
another thread of control may be modifying it. Such access is restricted 
using functions that synchronize thread execution and also synchronize 
memory with respect to other threads. The following functions 
synchronize memory with respect to other threads:


fork()
pthread_barrier_wait()
pthread_cond_broadcast()
pthread_cond_signal()
pthread_cond_timedwait()
pthread_cond_wait()
pthread_create()
pthread_join()
pthread_mutex_lock()
pthread_mutex_timedlock()
pthread_mutex_trylock()
pthread_mutex_unlock()
pthread_spin_lock()
pthread_spin_trylock()
pthread_spin_unlock()
pthread_rwlock_rdlock()
pthread_rwlock_timedrdlock()
pthread_rwlock_timedwrlock()
pthread_rwlock_tryrdlock()
pthread_rwlock_trywrlock()
pthread_rwlock_unlock()
pthread_rwlock_wrlock()
sem_post()
sem_trywait()
sem_wait()
wait()
waitpid()

The pthread_once() function shall synchronize memory for the first call 
in each thread for a given pthread_once_t object.


Unless explicitly stated otherwise, if one of the above functions 
returns an error, it is unspecified whether the invocation causes memory 
to be synchronized.


Applications may allow more than one thread of control to read a memory 
location simultaneously.


[end quote]
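
As a minimal sketch of what that section permits, the following shares a plain (non-volatile) counter between threads, using only functions from the list above. The names count_lock, shared_count, worker, count_with_mutex and MAX_THREADS are made up for the example.

```c
#include <pthread.h>
#include <stddef.h>

#define MAX_THREADS 16

static pthread_mutex_t count_lock = PTHREAD_MUTEX_INITIALIZER;
static long shared_count;            /* deliberately not volatile */

static void *worker(void *arg)
{
    long iterations = *(long *)arg;
    for (long i = 0; i < iterations; i++) {
        pthread_mutex_lock(&count_lock);    /* synchronizes memory (4.10) */
        shared_count++;
        pthread_mutex_unlock(&count_lock);  /* synchronizes memory (4.10) */
    }
    return NULL;
}

/* Runs nthreads workers, each adding `iterations` to the counter.
 * Returns the final count, or -1 on bad input or if a thread
 * could not be created. */
long count_with_mutex(int nthreads, long iterations)
{
    pthread_t tid[MAX_THREADS];
    int started = 0;

    if (nthreads < 1 || nthreads > MAX_THREADS)
        return -1;

    pthread_mutex_lock(&count_lock);
    shared_count = 0;
    pthread_mutex_unlock(&count_lock);

    for (int i = 0; i < nthreads; i++) {
        if (pthread_create(&tid[i], NULL, worker, &iterations) != 0)
            break;
        started++;
    }
    for (int i = 0; i < started; i++)
        pthread_join(tid[i], NULL);

    /* pthread_join() is on the list too, so this read is synchronized. */
    return started == nthreads ? shared_count : -1;
}
```

Every access to shared_count sits between a lock and an unlock, or after a join, which is what 4.10 asks of the application. A store to shared_count invented by the compiler outside the locked region would break this program even though the source follows the rule.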


-- Ross Smith


Re: Optimization of conditional access to globals: thread-unsafe?

2007-10-28 Thread David Miller
From: Darryl Miles <[EMAIL PROTECTED]>
Date: Mon, 29 Oct 2007 04:53:49 +

> What are the issues with "speculative loads" ?

The conditional might be protecting whether the pointer is valid and
can be dereferenced at all.

    int *counter;

    void foo(int counter_is_valid)
    {
        if (counter_is_valid)
            (*counter)++;
    }

And in another module that GCC can't see when compiling foo():

    extern int *counter;
    extern void foo(int counter_is_valid);

    int main(void)
    {
        int a = 0;

        foo(0);        /* counter is still NULL here */
        counter = &a;
        foo(1);

        return 0;
    }
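
To make the hazard concrete, here is a hand-written sketch of foo() with the load hoisted above the test. It is not compiler output, and the names foo_as_written and foo_with_hoisted_load are made up for the example.

```c
int *counter;   /* null until the caller points it at something */

/* As written: *counter is touched only when the flag says it is valid. */
void foo_as_written(int counter_is_valid)
{
    if (counter_is_valid)
        (*counter)++;
}

/* Hypothetical hoisted form, written by hand for illustration: the
 * dereference now precedes the test, so calling this with a null
 * counter is undefined behavior even when counter_is_valid is 0. */
void foo_with_hoisted_load(int counter_is_valid)
{
    int tmp = *counter;          /* speculative load */
    if (counter_is_valid)
        *counter = tmp + 1;
}
```

Both versions behave the same once counter points at a live object. Only the first is safe for the foo(0) call made before counter is set.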



Re: Optimization of conditional access to globals: thread-unsafe?

2007-10-28 Thread Tomash Brechko
On Mon, Oct 29, 2007 at 02:39:15 -, Dave Korn wrote:
>   BTW, you and Tomash should get your stories in synch.  He says speculative
> loads are ok, just no stores, and wants a kind of half-volatile flag that
> would only suppress stores.  I think you're already looking one step further
> down the road than he is and have realised that speculative loads will give
> you problems too.

You didn't do your homework.  This link
http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2007/n2338.html
(which was already posted in this thread) explains the matter; see the
"Speculative code motion involving loads" section.  So both David and
I are correct.


But curiously, Bart already tried _several times_ to explain why using
volatile is not an option, but his arguments seem to be too
"inconvenient" to be considered.  Let me repeat: suppose we agree that
all shared data should be annotated as volatile.  So if I want to
share dynamic data, I have to write


   _volatile_ data_type *pdata = malloc(size);


But how to use this data?  There are not many library functions that
accept pointer to volatile (and casting the qualifier away will bring
us back to the start).  Should every library function have 2^n copies
where different combinations of parameters are annotated as volatile?
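
A minimal sketch of the problem, using memset() as the library function. The name alloc_shared is made up for the example.

```c
#include <stdlib.h>
#include <string.h>

/* Returns a zero-filled buffer declared the way an "annotate all shared
 * data as volatile" rule would require, or NULL if malloc fails. */
volatile char *alloc_shared(size_t size)
{
    volatile char *pdata = malloc(size);
    if (pdata == NULL)
        return NULL;

    /* memset(pdata, 0, size) would pass a pointer-to-volatile where
     * memset expects plain void *, which compilers diagnose as
     * discarding the qualifier.  The only way through is a cast: */
    memset((void *)pdata, 0, size);   /* volatile is cast away here */
    return pdata;
}
```

Once the cast is in place, memset() accesses the buffer as ordinary memory, so the qualifier has bought nothing for that call.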


I think most pro-volatile people didn't understand the meaning of
several papers on the Internet that say you have to use volatile.
Those papers never meant to say that volatile is a proper way to use
shared data with POSIX threads, rather that because the compilers are
made the way they are you have to use volatile for now to overcome
compiler thread-unawareness.


David R. Butenhof was a member of the POSIX.1c (POSIX Threads)
committee.  In his book, "Programming with POSIX Threads", there are
no volatiles at all.  Of course one can say he didn't grok C, or even
POSIX, or POSIX Threads.  But it shows the intent, at least how he
felt it.

And this is the way to go: in a sane world standards follow the reality,
not the other way around.  And they will, that's why the work of Hans
Boehm is there.  As was already mentioned in this thread, while his
proposal is not final yet, most of the work is being done on atomics,
so it is highly unlikely that the "no-speculative-stores-please" requirement
will change.



-- 
   Tomash Brechko