Re: Optimization of conditional access to globals: thread-unsafe?

Darryl Miles Sat, 27 Oct 2007 10:27:34 -0700


Comments inline below vvvvv



Tomash Brechko wrote:

Consider this piece of code:

    extern int v;

    void
    f(int set_v)
    {
      if (set_v)
        v = 1;
    }

    f:
            pushl   %ebp
            movl    %esp, %ebp
            cmpl    $0, 8(%ebp)
            movl    $1, %eax
            cmove   v, %eax        ; load (maybe)
            movl    %eax, v        ; store (always)
            popl    %ebp
            ret

Note the last unconditional store to v.  Now, if some thread would
modify v between our load and store (acquiring the mutex first), then
we will overwrite the new value with the old one (and would do that in
a thread-unsafe manner, not acquiring the mutex).

So, do the calls to f(0) require the mutex, or it's a GCC bug?

The "unintended write-access" optimization is a massive headache for developers of multi-threaded code.

The problem here is the mandatory write access to a memory location for which the as-written code path does not indicate that a write memory access should occur.
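To make the unintended write concrete, here is a rough C rendering of what the quoted cmove sequence does, next to the function as written. This is only a sketch: the names f_as_written and f_as_compiled are made up for illustration, v is defined locally so the fragment compiles on its own, and a real compiler is free to transform either function further.

```c
/* Defined here (rather than extern) so the sketch is self-contained. */
int v;

/* As written: v is stored to only when set_v is nonzero. */
void f_as_written(int set_v)
{
    if (set_v)
        v = 1;
}

/* Roughly what the quoted assembly does (hypothetical hand translation). */
void f_as_compiled(int set_v)
{
    int tmp = 1;
    if (!set_v)
        tmp = v;    /* load (maybe) */
    v = tmp;        /* store (always), even when set_v == 0 */
}
```

In a single thread the two behave identically. They differ only when another thread updates v between the load and the store in f_as_compiled(0), in which case the old value is written back over the new one.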

This is a tricky one. Optimizations that have the effect of causing an "unintended write access to some resource" when the code path does not intend this to happen cross a line, IMHO.

I think that GCC should understand where that line is and have a compile-time parameter to configure whether that line may be crossed. It's a matter for debate what the default should be and/or whether -O6 should allow that line to be crossed, but having no mechanism to control it is the real bummer.

Even if the interpretation offered of the C language standard says the line may be crossed, from a practical point of view this is one aspect of optimization that a developer would want to have complete control over.

So much control that I would also like to see a pair of __attribute__((optimization_hint_keywords)) attached to the variable declaration to provide fine-grained control. Such a solution to the problem would keep everybody happy.

Here are some pieces from C99:

...SNIP...

Sec 3.1 par 4: NOTE 3 Expressions that are not evaluated do not access
               objects.

Hmm... on this point there can be a problem. There are 2 major types of access: read from memory (load) and write to memory (store). It is very possible to end up performing an optimistic read, only to throw away the value obtained due to a compare/jump. This is usually considered a safe optimization.

But reading the statement above as-is and in the context of this problem might make some believe this "optimistic read" optimization is breaking the rules.



Maybe in GCC there should be C99 adherence levels:

strict mode: Where this C99 clause is adhered to, but this is much like compiling code without optimization, as when debugging, since during debugging you always want nice clear per-line / per-expression separation so you can walk through execution with a debugger.

may optimize read access mode: This is the normal case for optimization, where you might interleave a 'compare reg with immediate' and a 'load from memory', then perform a 'conditional branch' that ends up at code that never uses the value loaded from memory. The only rare case where this is a problem is a read from special memory, but volatile exists in GCC for that, or you could move all accesses to that memory away from regular C language syntax and into a function call.

may optimize read and write access mode: This is the problem case you are seeing. Same as the mode above but also permits the unintended write access, but only to write back the same value as before (based on the compiler's thread-naive perception of execution at least!).
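As a small sketch of the second mode, the C fragment below writes out by hand what an optimistic load looks like: the read is issued before the condition is tested and simply discarded on the path that does not need it. The function name pick is hypothetical, and the hoisting is shown manually here rather than being something a given GCC version is guaranteed to do.

```c
/* Hypothetical illustration of an optimistic (speculative) load. */
int pick(int cond, const int *p)
{
    int speculative = *p;    /* load issued before the condition is tested */
    if (cond)
        return speculative;  /* the loaded value is used only on this path */
    return 0;                /* here the loaded value is thrown away */
}
```

Nothing is ever written through p, which is why this kind of read is usually harmless for ordinary memory; the third mode differs precisely in adding a store on the path that never asked for one.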

So, could someone explain me why this GCC optimization is valid, and,
if so, where lies the boundary below which I may safely assume GCC
won't try to store to objects that aren't stored to explicitly during
particular execution path?  Or maybe the named bug report is valid
after all?

As has been pointed out by others, there is no specification of what happens between threads.



Your route out of this problem is to write your own implementation of:

atomic_int_set(int *ptr, int value);

Which always uses an atomic single-instruction store. This is thread-safe with respect to ensuring that no other concurrent read or write to that location will ever see a corrupted value, where a corrupted value in this case would be some value other than "the previous value of 'v'" or "the value of '1'" you are setting. Also, once a concurrent access first observes "the value of '1'", it will not be possible to observe the previous value on a subsequent read (the value doesn't flap about once it changes, it changes for good).


if (set_v)
  atomic_int_set(&v, 1);

By doing the above you are programmatically dictating the method of thread-safety in 2 directions.

One direction is in terms of something that is agreeable to the compiler and that it can't optimize/change. When you use a basic function call to an external symbol, the compiler has no room to think about optimization, since it does not know the side effects that calling this external symbol may have. So it can't reorder this operation either, and it occurs exactly at the moment your code says it should.

The other direction is in terms of the target CPU assembler instructions and architecture. The implementation of your atomic_int_set() will be fixed by expressing the operation directly in assembler (to be sure it always uses a single-instruction move from register to memory).

You should also have a method for reading the current value of 'v' in an atomic way.


This may also mean you have to create a function:

extern int atomic_int_get(int *);

For the purpose of obtaining the current value. Yes, you have to apply the same care when reading the value as you do when setting it.
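A minimal sketch of the two helpers might look like the following. Note the assumptions: instead of hand-written assembler it uses GCC's __atomic_store_n and __atomic_load_n builtins, which appeared in GCC releases later than the ones under discussion here, and it keeps everything in one file so that it compiles on its own. In the arrangement described above the helpers would live in a separate translation unit, so that the caller only sees an external symbol.

```c
/* Sketch only: GCC __atomic builtins stand in for hand-written assembler. */

/* Defined here (rather than extern) so the sketch is self-contained. */
int v;

/* Store value to *ptr as a single atomic operation. */
void atomic_int_set(int *ptr, int value)
{
    __atomic_store_n(ptr, value, __ATOMIC_SEQ_CST);
}

/* Load the current value of *ptr as a single atomic operation. */
int atomic_int_get(int *ptr)
{
    return __atomic_load_n(ptr, __ATOMIC_SEQ_CST);
}

/* The original function, with the conditional store routed through the helper. */
void f(int set_v)
{
    if (set_v)
        atomic_int_set(&v, 1);
}
```

With this shape, f(0) contains no store to v at all on its execution path, and readers that go through atomic_int_get(&v) see either the previous value or 1.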

NB Marking the variable 'volatile' does not mean anything useful in the situation you are in. The exact meaning of 'volatile' can differ between compilers, but in the case of GCC it can stop the re-ordering and the caching-of-value-in-register aspects of your problem. But it will never enforce the method used to perform the load/store, nor will it (at this time) stop the unintended write access. That said, in the case of an aligned integer of natural bit width it is somewhat difficult for the compiler to do the wrong thing on most architectures, as the most efficient instruction is the atomic load/store between register and memory.



Darryl
