On Mon, Nov 16, 2020 at 3:11 AM Peter Zijlstra wrote:
>
> XXX: I've only verified the below actually compiles, I've not verified
> the generated code is actually 'correct'.
Well, it was mainly the arm64 code generation for load-acquire and
store-release that wanted this - so it's really the
On Tue, Nov 17, 2020 at 11:25 AM Jakub Jelinek wrote:
>
> It would need to be typeof( (typeof(type)) (type) ) to not be that
> constrained on what kind of expressions it accepts as arguments.
Yup.
> Anyway, it won't work with array types at least,
> int a[10];
> typeof ((typeof (a)) (a)) b;
On Tue, Nov 17, 2020 at 11:13 AM Linus Torvalds
wrote:
>
> > +#define __unqual_typeof(type) typeof( (typeof(type))type )
>
> that's certainly a much nicer version than the existing pre-processor
> expansion from hell.
Oh, and sparse doesn't handle this, and doesn
On Sun, Feb 28, 2016 at 12:27 AM, Markus Trippelsdorf
wrote:
>> >
>> > -fno-strict-overflow
>>
>> -fno-strict-aliasing.
>
> Do not forget -fno-delete-null-pointer-checks.
>
> So the kernel obviously is already using its own C dialect, that is
> pretty far from standard C.
> All these options a
On Mon, Feb 29, 2016 at 9:37 AM, Michael Matz wrote:
>
> The important part is with induction variables controlling
> loops:
>
> short i; for (i = start; i < end; i++)
> vs.
> unsigned short u; for (u = start; u < end; u++)
>
> For the former you're allowed to assume that the loop will termina
On Tue, Jun 10, 2014 at 6:23 AM, Jiri Kosina wrote:
> We have been chasing a memory corruption bug, which turned out to be
> caused by very old gcc (4.3.4), which happily turned conditional load into
> a non-conditional one, and that broke correctness (the condition was met
> only if lock was held
On Mon, May 4, 2015 at 1:14 PM, H. Peter Anvin wrote:
>
> I would argue that for x86 what you actually want is to model the
> *conditions* that are available on the flags, not the flags themselves.
Yes. Otherwise it would be a nightmare to try to describe simple
conditions like "le", which a rath
On Mon, May 4, 2015 at 1:33 PM, Richard Henderson wrote:
>
> A fair point. Though honestly, I was hoping that this feature would mostly be
> used for conditions that are "weird" -- that is, not normally describable by
> arithmetic at all. Otherwise, why are you using inline asm for it?
I could
On Tue, May 5, 2015 at 6:50 AM, Segher Boessenkool
wrote:
>
> Since it is pre-processed, there is no real reason to overlap this with
> the constraints namespace; we could have e.g. "=@[xy]" (and "@[xy]" for
> inputs) mean the target needs to do some "xy" transform here.
In fact, standing out vis
On Tue, May 19, 2015 at 5:55 PM, Paul E. McKenney
wrote:
>
> http://www.rdrop.com/users/paulmck/RCU/consume.2015.05.18a.pdf
From a very quick read-through, the restricted dependency chain in 7.9
seems to be reasonable, and essentially covers "that's what hardware
gives us anyway", making
On Tue, May 19, 2015 at 6:57 PM, Linus Torvalds
wrote:
>
> - the "you can add/subtract integral values" still opens you up to
> language lawyers claiming "(char *)ptr - (intptr_t)ptr" preserving the
> dependency, which it clearly doesn't. But language
On Thu, May 21, 2015 at 1:02 PM, Paul E. McKenney
wrote:
>
> The compiler can (and does) speculate non-atomic non-volatile writes
> in some cases, but I do not believe that it is permitted to speculate
> either volatile or atomic writes.
I do *not* believe that a compiler is ever allowed to specu
On Wed, Feb 1, 2012 at 7:19 AM, Jan Kara wrote:
>
> we've spotted the following mismatch between what kernel folks expect
> from a compiler and what GCC really does, resulting in memory corruption on
> some architectures.
This is sad.
We've had something like this before due to architectural re
On Wed, Feb 1, 2012 at 8:37 AM, Colin Walters wrote:
>
> 1) Use the same lock for a given bitfield
That's not the problem. All the *bitfield* fields are all accessed
under the same word already.
> 2) Split up the bitfield into different words
Again, it's not the bitfield that is the problem.
T
On Wed, Feb 1, 2012 at 9:08 AM, Torvald Riegel wrote:
>
> What do the kernel folks think about the C11 memory model? If you can
> spot any issues in there, the GCC community would certainly like to
> know.
I don't think this is about memory models except very tangentially.
Gcc currently accesse
On Wed, Feb 1, 2012 at 9:11 AM, Jiri Kosina wrote:
> On Wed, 1 Feb 2012, Linus Torvalds wrote:
>>
>> And I suspect it really is a generic bug that can be shown even with
>> the above trivial example.
>
> I have actually tried exactly this earlier today (because while
On Wed, Feb 1, 2012 at 9:41 AM, Michael Matz wrote:
>
> One problem is that it's not a new problem, GCC emitted similar code since
> about forever, and still they turned up only now (well, probably because
> ia64 is dead, but sparc64 should have similar problems). The bitfield
> handling code is
On Wed, Feb 1, 2012 at 10:09 AM, David Miller wrote:
>
> Personally I've avoided C bitfields like the plague in any code I've
> written.
I do agree with that. The kernel largely tries to avoid bitfields,
usually because we have some really strict rules about different
bitfields, but also because
On Wed, Feb 1, 2012 at 10:45 AM, Jeff Law wrote:
>
> Torvald Riegel & I were told that was kernel policy when we brought up the
> upcoming bitfield semantic changes with some of the linux kernel folks last
> year.
Btw, one reason this is true is that the bitfield ordering/packing is
so unspecifie
On Wed, Feb 1, 2012 at 9:42 AM, Torvald Riegel wrote:
>
> We need a proper memory model.
Not really.
The fact is, the kernel will happily take the memory model of the
underlying hardware. Trying to impose some compiler description of the
memory model is actually horribly bad, because it automati
On Wed, Feb 1, 2012 at 11:40 AM, Jakub Jelinek wrote:
>
> Well, the C++11/C11 model doesn't allow to use the underlying type
> for accesses, consider e.g.
>
> struct S { long s1; unsigned int s2 : 5; unsigned int s3 : 19; unsigned char
> s4; unsigned int s5; };
> struct T { long s1 : 16; unsigned
On Wed, Feb 1, 2012 at 12:01 PM, Linus Torvalds
wrote:
>
> - However, while using the *smallest* possible access may generate
> correct code, it often generates really *crappy* code. Which is
> exactly the bug that I reported in
>
> http://gcc.gnu.org/bugzilla/show_bug.cgi?i
On Wed, Feb 1, 2012 at 12:16 PM, Jakub Jelinek wrote:
>>
>> So the kernel really doesn't care what you do to things *within* the
>> bitfield.
>
> But what is *within* the bitfield? Do you consider s4 or t2 fields
> (non-bitfield fields that just the ABI wants to pack together with
> the bitfield
On Wed, Feb 1, 2012 at 12:41 PM, Torvald Riegel wrote:
>
> You do rely on the compiler to do common transformations I suppose:
> hoist loads out of loops, CSE, etc. How do you expect the compiler to
> know whether they are allowed for a particular piece of code or not?
We have barriers.
Compile
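For reference, the kind of compiler barrier the reply refers to can be sketched as below. This is a minimal illustration assuming GCC or Clang inline-asm syntax: `barrier()` follows the kernel's long-standing macro, while `flag`, `set_flag` and `wait_for_flag` are hypothetical names used only for this example.

```c
/* Compiler-only barrier: an empty asm with a "memory" clobber tells the
 * compiler that memory may have changed, so it must not keep memory values
 * cached in registers across it or move memory accesses over it.
 * It emits no instruction and is not a hardware memory barrier. */
#define barrier() __asm__ __volatile__("" : : : "memory")

static int flag;   /* hypothetical shared flag */

void set_flag(int v)
{
    flag = v;
}

/* Polls `flag` up to `max_spins` times; returns 1 if it was seen set. */
int wait_for_flag(int max_spins)
{
    for (int i = 0; i < max_spins; i++) {
        if (flag)
            return 1;
        barrier();   /* forces `flag` to be re-read on the next iteration */
    }
    return 0;
}
```

Without the barrier, the compiler would be allowed to hoist the load of `flag` out of the loop, which is exactly the kind of transformation the quoted question asks about.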
On Wed, Feb 1, 2012 at 12:53 PM, Torvald Riegel wrote:
>
> For volatile, I agree.
>
> However, the original btrfs example was *without* a volatile, and that's
> why I raised the memory model point. This triggered an error in a
> concurrent execution, so that's memory model land, at least in C
> l
On Wed, Feb 1, 2012 at 1:24 PM, Torvald Riegel wrote:
>> It's not the only thing we do. We have cases where it's not that you
>> can't hoist things outside of loops, it's that you have to read things
>> exactly *once*, and then use that particular value (ie the compiler
>> can't be allowed to relo
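The "read exactly once, then use that value" requirement is what the kernel's ACCESS_ONCE() expresses. A minimal sketch follows, assuming the GNU `__typeof__` extension; `struct config`, `current_cfg` and `limit_or_default` are hypothetical names for illustration only.

```c
/* Kernel-style ACCESS_ONCE(): the volatile cast makes the compiler emit
 * exactly one access for this expression and forbids it from reloading
 * the value later. Relies on the GNU __typeof__ extension. */
#define ACCESS_ONCE(x) (*(volatile __typeof__(x) *)&(x))

struct config { int limit; };        /* hypothetical */
static struct config *current_cfg;   /* hypothetical shared pointer */

int limit_or_default(int dflt)
{
    /* Take one snapshot of the pointer and use only that snapshot.
     * With a plain load, the compiler could legally read current_cfg
     * again after the NULL check and see a different value. */
    struct config *cfg = ACCESS_ONCE(current_cfg);

    if (!cfg)
        return dflt;
    return cfg->limit;
}
```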
On Wed, Feb 1, 2012 at 1:25 PM, Boehm, Hans wrote:
>
> Here are some more interesting ones that illustrate the issues (all
> declarations are non-local, unless stated otherwise):
>
> struct { char a; int b:9; int c:7; char d; } x;
>
> Is x.b = 1 allowed to overwrite x.a? C11 says no, essentially r
On Wed, Feb 1, 2012 at 2:45 PM, Paul E. McKenney
wrote:
>
> My (perhaps forlorn and naive) hope is that C++11 memory_order_relaxed
> will eventually allow ACCESS_ONCE() to be upgraded so that (for example)
> access-once increments can generate a single increment-memory instruction
> on x86.
I don
On Thu, Feb 2, 2012 at 8:28 AM, Michael Matz wrote:
>
> Sure. Simplest example: struct s {int i:24;} __attribute__((packed)).
>
> You must access only three bytes, no matter what. The basetype (int) is
> four bytes.
Ok, so here's a really *stupid* (but also really really simple) patch attached.
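To make the quoted packed example concrete (this is not the attached patch, just a small sketch assuming GCC or Clang and their `packed` attribute), the struct really does occupy three bytes while its bitfield base type is wider. The helper names `packed_size` and `basetype_size` are hypothetical.

```c
#include <stddef.h>

/* The packed struct from the quoted mail: one 24-bit field, base type int. */
struct s {
    int i : 24;
} __attribute__((packed));

/* Bytes occupied by the whole struct. */
size_t packed_size(void)
{
    return sizeof(struct s);
}

/* Bytes of the bitfield's declared base type. */
size_t basetype_size(void)
{
    return sizeof(int);
}
```

In an array of `struct s`, an access as wide as the base type to element 0 would therefore spill into element 1, which is why the access has to stay within three bytes.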
On Thu, Feb 2, 2012 at 10:42 AM, Paul E. McKenney
wrote:
>>
>> SMP-atomic or percpu atomic? Or both?
>
> Only SMP-atomic.
And I assume that since the compiler does them, that would now make it
impossible for us to gather a list of all the 'lock' prefixes so that
we can undo them if it turns out t
On Fri, Feb 3, 2012 at 8:38 AM, Andrew MacLeod wrote:
>
> The atomic intrinsics were created for c++11 memory model compliance, but I
> am certainly open to enhancements that would make them more useful. I am
> planning some enhancements for 4.8 now, and it sounds like you may have some
> sugge
On Fri, Feb 3, 2012 at 11:16 AM, Andrew MacLeod wrote:
>> The special cases are because older x86 cannot do the generic
>> "add_return" efficiently - it needs xadd - but can do atomic versions
>> that test the end result and give zero or sign information.
>
> Since these are older x86 only, cou
On Thu, 19 Nov 2009, Thomas Gleixner wrote:
>
> standard function start:
>
>        push   %ebp
>        mov    %esp, %ebp
>
>        call   mcount
>
> modified function start on a handful of functions only seen with gcc
> 4.4.x on x86 32 bit:
>
> push %edi
> lea
On Thu, 19 Nov 2009, Richard Guenther wrote:
>
> Note that I only can reproduce the issue with
> -mincoming-stack-boundary=2, not with -mpreferred-stack-boundary=2.
Since you can reproduce it with -mincoming-stack-boundary=2, I would
suggest just fixing mcount handling that way regardless of an
On Thu, 19 Nov 2009, Andrew Haley wrote:
>
> I've got all that off-list. I found the cause, and replied in another
> email. It's not a bug.
Oh Gods, are we back to gcc people saying "sure, we do stupid things, but
it's allowed, so we don't consider it a bug because it doesn't matter that
re
On Thu, 19 Nov 2009, Linus Torvalds wrote:
>
> Oh Gods, are we back to gcc people saying "sure, we do stupid things, but
> it's allowed, so we don't consider it a bug because it doesn't matter that
> real people care about real life, we only care about some
On Thu, 19 Nov 2009, Linus Torvalds wrote:
>
> I bet other people than just the kernel use the mcount hook for subtler
> things than just doing profiles. And even if they don't, the quoted code
> generation is just crazy _crap_.
For the kernel, if the only case is that t
On Thu, 19 Nov 2009, H. Peter Anvin wrote:
>
> Calling the profiler immediately at the entry point is clearly the more
> sane option. It means the ABI is well-defined, stable, and independent
> of what the actual function contents are. It means that ABI isn't the
> normal C ABI (the __fentry__
On Thu, 19 Nov 2009, Frederic Weisbecker wrote:
>
> > That way the lr would have the current function, and the parent would
> > still be at 8(%sp)
>
> Yeah right, we need at least such very tiny prologue for
> archs that store return addresses in a reg.
Well, it will be architecture-dependent.
On Fri, 20 Nov 2009, Thomas Gleixner wrote:
>
> While testing various kernel configs we found out that the problem
> comes and goes. Finally I started to compare the gcc command line
> options and after some fiddling it turned out that the following
> minimal deltas change the code generator beh
On Sun, Nov 14, 2010 at 4:52 PM, James Cloos wrote:
> Gcc 4.5.1 running on an amd64 box "cross"-compiling for a P3 i8k fails
> to compile the module since commit 6b4e81db2552bad04100e7d5ddeed7e848f53b48
> with:
>
> CC drivers/char/i8k.o
> drivers/char/i8k.c: In function ‘i8k_smm’:
> drivers/
On Mon, Nov 15, 2010 at 3:16 AM, Jakub Jelinek wrote:
>
> I don't see any problems on the assembly level. i8k_smm is
> not inlined in this case and checks all 3 conditions.
If it really is related to gcc not understanding that "*regs" has
changed due to the memory being an automatic variable, an
On Mon, Nov 15, 2010 at 9:40 AM, Jim Bos wrote:
>
> Hmm, that doesn't work.
>
> [ Not sure if you read the whole thread but initial workaround was to
> change the asm(..) to asm volatile(..) which did work. ]
Since I have a different gcc than yours (and I'm not going to compile
my own), have you p
On Mon, Nov 15, 2010 at 10:30 AM, Jim Bos wrote:
>
> Attached version with plain 2.6.36 source and version with the committed
> patch, i.e with the '"+m" (*regs)'
Looks 100% identical in i8k_smm() itself, and I'm not seeing anything
bad. The asm has certainly not been optimized away as implied in
On Mon, Nov 15, 2010 at 10:45 AM, Jeff Law wrote:
>
> A memory clobber should clobber anything in memory, including autos in
> memory; if it doesn't, then that seems like a major problem. I'd like to
> see the rationale behind not clobbering autos in memory.
Yes. It turns out that the "asm optim
On Mon, Nov 15, 2010 at 11:12 AM, Jakub Jelinek wrote:
>
> Ah, the problem is that memory_identifier_string is only initialized in
> ipa-reference.c's initialization, so it can be (and is in this case) NULL in
> ipa-pure-const.c.
Ok. And I guess you can verify that all versions of gcc do this
cor
On Mon, Aug 5, 2013 at 9:55 AM, Steven Rostedt wrote:
>
> Almost a full year ago, Mathieu suggested something like:
>
> if (unlikely(x)) __attribute__((section(".unlikely"))) {
> ...
> } else __attribute__((section(".likely"))) {
> ...
> }
It's almost certainly a horrible idea.
F
On Mon, Aug 5, 2013 at 10:12 AM, Linus Torvalds
wrote:
>
> Secondly, you don't want a separate section anyway for any normal
> kernel code, since you want short jumps if possible
Just to clarify: the short jump is important regardless of how
unlikely the code you're jumpin
On Mon, Aug 5, 2013 at 10:55 AM, Steven Rostedt wrote:
>
> My main concern is with tracepoints. Which on 90% (or more) of systems
> running Linux, is completely off, and basically just dead code, until
> someone wants to see what's happening and enables them.
The static_key_false() approach with
On Mon, Aug 5, 2013 at 11:20 AM, Linus Torvalds
wrote:
>
> The static_key_false() approach with minimal inlining sounds like a
> much better approach overall.
Sorry, I misunderstood your thing. That's actually what you want that
section thing for, because right now you canno
On Mon, Aug 5, 2013 at 11:24 AM, Linus Torvalds
wrote:
>
> Ugh. I can see the attraction of your section thing for that case, I
> just get the feeling that we should be able to do better somehow.
Hmm.. Quite frankly, Steven, for your use case I think you actually
want the C got
On Mon, Aug 5, 2013 at 11:39 AM, Steven Rostedt wrote:
>
> I had patches that did exactly this:
>
> https://lkml.org/lkml/2012/3/8/461
>
> But it got dropped for some reason. I don't remember why. Maybe because
> of the complexity?
Ugh. Why the crazy update_jump_label script stuff? I'd go "Eww"
On Mon, Aug 5, 2013 at 11:51 AM, H. Peter Anvin wrote:
>>
>> Also, how would you pass the parameters? Every tracepoint has its own
>> parameters to pass to it. How would a trap know where to get "prev"
>> and "next"?
>
> How do you do that now?
>
> You have to do an IP lookup to find out what
On Mon, Aug 5, 2013 at 12:04 PM, Andi Kleen wrote:
> Steven Rostedt writes:
>
> Can't you just use -freorder-blocks-and-partition?
>
> This should already partition unlikely blocks into a
> different section. Just a single one of course.
That's horrible. Not because of dwarf problems, but exactl
On Mon, Aug 5, 2013 at 12:40 PM, Marek Polacek wrote:
>
> FWIW, we also support hot/cold attributes for labels, thus e.g.
>
> if (bar ())
> goto A;
> /* ... */
> A: __attribute__((cold))
> /* ... */
>
> I don't know whether that might be useful for what you want or not though...
Steve?
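A compilable form of the label attribute quoted above might look like this. It is a sketch assuming GCC's label-attribute syntax (other compilers may ignore the attribute or warn); `double_or_fail` is a hypothetical function.

```c
/* Returns twice the input, or -1 for negative input. */
int double_or_fail(int v)
{
    if (v < 0)
        goto fail;
    return 2 * v;

fail:
    __attribute__((cold));   /* label attribute; the semicolon is required */
    return -1;
}
```

The attribute only hints that the path through `fail` is unlikely; it does not by itself say in which section the code ends up.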
On Mon, Aug 5, 2013 at 12:54 PM, Mathieu Desnoyers
wrote:
>
> I remember that choosing between 2 and 5 bytes nop in the asm goto was
> tricky: it had something to do with the fact that gcc doesn't know the
> exact size of each instructions until further down within compilation
Oh, you can't do it
On Tue, Aug 6, 2013 at 7:19 AM, Steven Rostedt wrote:
>
> After playing with the patches again, I now understand why I did that.
> It wasn't just for optimization.
[explanation snipped]
> Anyway, if you feel that update_jump_label is too complex, I can go the
> "update at early boot" route and s
On Sun, Feb 9, 2014 at 4:27 PM, Torvald Riegel wrote:
>
> I wouldn't characterize the situation like this (although I can't speak
> for others, obviously). IMHO, it's perfectly fine on sequential /
> non-synchronizing code, because we know the difference isn't observable
> by a correct program.
On Sun, Feb 9, 2014 at 5:16 PM, Torvald Riegel wrote:
>
> (a) seems to say that you don't like requiring programmers to mark
> atomic accesses specially. Is that the case?
In Paul's example, they were marked specially.
And you seemed to argue that Paul's example could possibly return
anything b
On Sun, Feb 9, 2014 at 5:46 PM, Torvald Riegel wrote:
>
> IOW, I wrote that such a compiler transformation would be wrong in my
> opinion. Thus, it should *not* return 42.
Ahh, I am happy to have misunderstood. The "intuitively" threw me,
because I thought that was building up to a "but", and mi
On Sun, Feb 9, 2014 at 4:27 PM, Torvald Riegel wrote:
>
> Intuitively, this is wrong because this lets the program take a step
> the abstract machine wouldn't do. This is different to the sequential
> code that Peter posted because it uses atomics, and thus one can't
> easily assume that the dif
On Wed, Feb 12, 2014 at 10:07 AM, Paul E. McKenney
wrote:
>
> Us Linux-kernel hackers will often need to use volatile semantics in
> combination with C11 atomics in most cases. The C11 atomics do cover
> some of the reasons we currently use ACCESS_ONCE(), but not all of them --
> in particular, i
On Fri, Feb 14, 2014 at 9:29 AM, Paul E. McKenney
wrote:
>
> Linus, Peter, any objections to marking places where we are relying on
> ordering from control dependencies against later stores? This approach
> seems to me to have significant documentation benefits.
Quite frankly, I think it's stupi
On Fri, Feb 14, 2014 at 11:50 AM, Linus Torvalds
wrote:
>
> Why are we still discussing this idiocy? It's irrelevant. If the
> standard really allows random store speculation, the standard doesn't
> matter, and sane people shouldn't waste their time arguing about it.
On Fri, Feb 14, 2014 at 6:08 PM, Paul E. McKenney
wrote:
>
> One way of looking at the discussion between Torvald and myself would be
> as a seller (Torvald) and a buyer (me) haggling over the fine print in
> a proposed contract (the standard). Whether that makes you feel better
> or worse about
On Fri, Feb 14, 2014 at 6:44 PM, Linus Torvalds
wrote:
>
> And conversely, the C11 people can walk away from us too. But if they
> can't make us happy (and by "make us happy", I really mean no stupid
> games on our part) I personally think they'll have a stronger
On Sat, Feb 15, 2014 at 9:45 AM, Torvald Riegel wrote:
>
> I think a major benefit of C11's memory model is that it gives a
> *precise* specification for how a compiler is allowed to optimize.
Clearly it does *not*. This whole discussion is proof of that. It's
not at all clear, and the standard a
On Sat, Feb 15, 2014 at 9:30 AM, Torvald Riegel wrote:
>
> I think the example is easy to misunderstand, because the context isn't
> clear. Therefore, let me first try to clarify the background.
>
> (1) The abstract machine does not write speculatively.
> (2) Emitting a branch instruction and exe
On Mon, Feb 17, 2014 at 11:55 AM, Torvald Riegel wrote:
>
> Which example do you have in mind here? Haven't we resolved all the
> debated examples, or did I miss any?
Well, Paul seems to still think that the standard possibly allows
speculative writes or possibly value speculation in ways that b
On Mon, Feb 17, 2014 at 1:21 PM, Torvald Riegel wrote:
> On Mon, 2014-02-17 at 12:18 -0800, Linus Torvalds wrote:
>> and then it is read by people (compiler writers) that intentionally
>> try to mis-use the words and do language-lawyering ("that depends on
>> w
On Mon, Feb 17, 2014 at 2:09 PM, Torvald Riegel wrote:
> On Sat, 2014-02-15 at 11:15 -0800, Linus Torvalds wrote:
>> >
>> > if (atomic_load(&x, mo_relaxed) == 1)
>> >   atomic_store(&y, 3, mo_relaxed);
>>
>> No, please don't use this
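Spelled out in real C11 (the quoted `mo_relaxed` is shorthand for `memory_order_relaxed`), the relaxed load/store pair under discussion would look roughly like this. The function name `store_if_set` is hypothetical.

```c
#include <stdatomic.h>

static atomic_int x, y;   /* zero-initialized shared variables */

/* Store 3 to y only if x was observed to be 1; both accesses are relaxed,
 * so the only ordering between them comes from the control dependency. */
void store_if_set(void)
{
    if (atomic_load_explicit(&x, memory_order_relaxed) == 1)
        atomic_store_explicit(&y, 3, memory_order_relaxed);
}
```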
On Mon, Feb 17, 2014 at 2:25 PM, Torvald Riegel wrote:
> On Mon, 2014-02-17 at 14:02 -0800, Linus Torvalds wrote:
>>
>> The argument was that an lvalue doesn't actually "access" the memory
>> (an rvalue does), so this:
>>
>>   volatile int *p = ..
On Mon, Feb 17, 2014 at 3:10 PM, Alec Teal wrote:
>
> You mean "unambiguous" - try reading a patent (Apple have 1000s of trivial
> ones, I tried reading one once thinking "how could they have phrased it so
> this got approved", their technique was to make the reader want to start
> cutting themsel
On Mon, Feb 17, 2014 at 3:17 PM, Torvald Riegel wrote:
> On Mon, 2014-02-17 at 14:32 -0800,
>
>> Stop claiming it "can return 1".. It *never* returns 1 unless you do
>> the load and *verify* it, or unless the load itself can be made to go
>> away. And with the code sequence given, that just doesn'
On Mon, Feb 17, 2014 at 3:41 PM, Torvald Riegel wrote:
>
> There's an underlying problem here that's independent from the actual
> instance that you're worried about here: "no sense" is ultimately a
> matter of taste/objectives/priorities as long as the respective
> specification is logically co
On Mon, Feb 17, 2014 at 7:00 PM, Paul E. McKenney
wrote:
>
> One example that I learned about last week uses the branch-prediction
> hardware to validate value speculation. And no, I am not at all a fan
> of value speculation, in case you were curious.
Heh. See the example I used in my reply to
On Mon, Feb 17, 2014 at 7:24 PM, Linus Torvalds
wrote:
>
> As far as I can tell, the intent is that you can't do value
> speculation (except perhaps for the "relaxed", which quite frankly
> sounds largely useless).
Hmm. The language I see for "consume" is n
On Tue, Feb 18, 2014 at 7:31 AM, Torvald Riegel wrote:
> On Mon, 2014-02-17 at 16:05 -0800, Linus Torvalds wrote:
>> And exactly because I know enough, I would *really* like atomics to be
>> well-defined, and have very clear - and *local* - rules about how they
>> can be c
On Tue, Feb 18, 2014 at 4:12 AM, Peter Sewell wrote:
>
> For example, suppose we have, in one compilation unit:
>
> void f(int ra, int*rb) {
> if (ra==42)
> *rb=42;
> else
> *rb=42;
> }
So this is a great example, and in general I really like your page at:
> F
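The point of the quoted function is easier to see next to the form a compiler is free to reduce it to. The sketch below repeats `f` as quoted and adds `f_merged`, a hypothetical hand-written equivalent of the merged result (not actual compiler output).

```c
/* As quoted: both arms of the branch perform the same store. */
void f(int ra, int *rb)
{
    if (ra == 42)
        *rb = 42;
    else
        *rb = 42;
}

/* What the branch may be merged into: the store no longer depends on ra,
 * so the generated code has no conditional branch and hence no control
 * dependency from ra to the store. */
void f_merged(int ra, int *rb)
{
    (void)ra;
    *rb = 42;
}
```

Sequentially the two are indistinguishable; the difference only matters when `ra` came from a load that other threads expect the store to be ordered after.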
On Tue, Feb 18, 2014 at 8:17 AM, Torvald Riegel wrote:
>>
>> "Consume operation: no reads in the current thread dependent on the
>> value currently loaded can be reordered before this load"
>
> I can't remember seeing that language in the standard (ie, C or C++).
> Where is this from?
That's ju
On Tue, Feb 18, 2014 at 10:21 AM, Peter Sewell
wrote:
>
> This is a bit more subtle, because (on ARM and POWER) removing the
> dependency and conditional branch is actually in general *not* equivalent
> in the hardware, in a concurrent context.
So I agree, but I think that's a generic issue with
On Tue, Feb 18, 2014 at 10:23 AM, Peter Sewell
wrote:
>
> interesting list. So are you saying that value-range-analysis and
> such-like (I say glibly, without really knowing what "such-like"
> refers to here) are fundamentally incompatible with
> the kernel code
No, it's fine to do things like v
On Tue, Feb 18, 2014 at 1:21 PM, Torvald Riegel wrote:
>>
>> So imagine that you have some clever global optimizer that sees that
>> the program never ever actually sets the dirty bit at all in any
>> thread, and then uses that kind of non-local knowledge to make
>> optimization decisions. THAT WO
On Wed, Feb 19, 2014 at 6:40 AM, Torvald Riegel wrote:
>
> If all those other threads written in whichever way use the same memory
> model and ABI for synchronization (e.g., choice of HW barriers for a
> certain memory_order), it doesn't matter whether it's a hardware thread,
> microcode, whatever
On Tue, Feb 18, 2014 at 11:47 AM, Torvald Riegel wrote:
> On Tue, 2014-02-18 at 09:44 -0800, Linus Torvalds wrote:
>>
>> Can you point to it? Because I can find a draft standard, and it sure
>> as hell does *not* contain any clarity of the model. It has a *lot* of
>>
On Wed, Feb 19, 2014 at 8:01 PM, Paul E. McKenney
wrote:
>
> The control dependency should order subsequent stores, at least assuming
> that "a" and "b" don't start off with identical stores that the compiler
> could pull out of the "if" and merge. The same might also be true for ?:
> for all I k
On Thu, Feb 20, 2014 at 12:30 AM, Paul E. McKenney
wrote:
>>
>> So lets make this really simple: if you have a consume->cmp->read, is
>> the ordering of the two reads guaranteed?
>
> Not as far as I know. Also, as far as I know, there is no difference
> between consume and relaxed in the consume-
On Thu, Feb 20, 2014 at 9:14 AM, Torvald Riegel wrote:
>>
>> So the clarification is basically to the statement that the "if
>> (consume(p)) a" version *would* have an ordering guarantee between the
>> read of "p" and "a", but the "consume(p) ? a : b" would *not* have
>> such an ordering guarantee
On Thu, Feb 20, 2014 at 9:49 AM, Torvald Riegel wrote:
>
> Yes, mo_consume is more tricky than mo_acquire.
>
> However, that has an advantage because you can avoid getting stronger
> barriers if you don't need them (ie, you can avoid the "auto-update to
> acquire" you seem to have in mind).
Oh, I
On Thu, Feb 20, 2014 at 10:11 AM, Paul E. McKenney
wrote:
>
> You really need that "consume" to be "acquire".
So I think we now all agree that that is what the standard is saying.
And I'm saying that that is wrong, that the standard is badly written,
and should be fixed.
Because before the stan
On Thu, Feb 20, 2014 at 10:25 AM, Linus Torvalds
wrote:
>
> While in my *sane* model, where you can consume things even if they
> then result in control dependencies, there will still eventually be a
> "sync" instruction on powerpc (because you really need one between the
On Thu, Feb 20, 2014 at 11:02 AM, Linus Torvalds
wrote:
>
> Again, the way I'd expect a compiler writer to actually *do* this is
> to just default to "ac
Oops, pressed send by mistake too early.
I was almost done:
I'd expect a compiler to just default to "acquir
On Thu, Feb 20, 2014 at 10:53 AM, Torvald Riegel wrote:
> On Thu, 2014-02-20 at 10:32 -0800, Linus Torvalds wrote:
>> On Thu, Feb 20, 2014 at 10:11 AM, Paul E. McKenney
>> wrote:
>> >
>> > You really need that "consume" to be "acquire".
>
On Thu, Feb 20, 2014 at 10:56 AM, Paul E. McKenney
wrote:
>
> The example gcc breakage was something like this:
>
> i = atomic_load(idx, memory_order_consume);
> x = array[0 + i - i];
>
> Then gcc optimized this to:
>
> i = atomic_load(idx, memory_order_consume);
>
On Thu, Feb 20, 2014 at 2:10 PM, Paul E. McKenney
wrote:
>
> Linus, given that you are calling me out for pushing "legalistic and bad"
> things, "syntactic bullshit", and playing "little games", I am forced
> to conclude that you have never attended any sort of standards-committee
> meeting. ;-)
On Fri, Feb 21, 2014 at 10:25 AM, Peter Sewell
wrote:
>
> If one thinks this is too fragile, then simply using memory_order_acquire
> and paying the resulting barrier cost (and perhaps hoping that compilers
> will eventually be able to optimise some cases of those barriers to
> hardware-level depe
On Fri, Feb 21, 2014 at 11:16 AM, Linus Torvalds
wrote:
>
> Why would this be any different, especially since it's easy to
> understand both for a human and a compiler?
Btw, the actual data path may actually be semantically meaningful even
at a processor level.
For example, let
On Fri, Feb 21, 2014 at 11:43 AM, Peter Sewell
wrote:
>
> You have to track dependencies through other assignments, e.g. simple x=y
That is all visible in the SSA form. Variable assignment has been
converted to some use of the SSA node that generated the value. The
use might be a phi node or a ca
On Sat, Feb 22, 2014 at 10:53 AM, Torvald Riegel wrote:
>
> Stating that (1) "the standard is wrong" and (2) that you think that
> mo_consume semantics are not good is two different things.
I do agree. They are two independent things.
I think the standard is wrong, because it's overly complex, h
On Sat, Feb 22, 2014 at 4:39 PM, Paul E. McKenney
wrote:
>
> Agreed, by far the most frequent use is "->" to dereference and assignment
> to store into a local variable. The other operations where the kernel
> expects ordering to be maintained are:
>
> o Bitwise "&" to strip off low-order b