from:"Linus Torvalds"

Re: Re: typeof and operands in named address spaces

2020-11-17 Thread Linus Torvalds

On Mon, Nov 16, 2020 at 3:11 AM Peter Zijlstra wrote: > > XXX: I've only verified the below actually compiles, I've not verified > the generated code is actually 'correct'. Well, it was mainly the arm64 code generation for load-acquire and store-release that wanted this - so it's really the

Re: Re: typeof and operands in named address spaces

2020-11-17 Thread Linus Torvalds

On Tue, Nov 17, 2020 at 11:25 AM Jakub Jelinek wrote: > > It would need to be typeof( (typeof(type)) (type) ) to not be that > constrained on what kind of expressions it accepts as arguments. Yup. > Anyway, it won't work with array types at least, > int a[10]; > typeof ((typeof (a)) (a)) b;

Re: Re: typeof and operands in named address spaces

2020-11-17 Thread Linus Torvalds

On Tue, Nov 17, 2020 at 11:13 AM Linus Torvalds wrote: > > > +#define __unqual_typeof(type) typeof( (typeof(type))type ) > > that's certainly a much nicer version than the existing pre-processor > expansion from hell. Oh, and sparse doesn't handle this, and doesn

Re: [isocpp-parallel] Proposal for new memory_order_consume definition

2016-02-28 Thread Linus Torvalds

On Sun, Feb 28, 2016 at 12:27 AM, Markus Trippelsdorf wrote: >> > >> > -fno-strict-overflow >> >> -fno-strict-aliasing. > > Do not forget -fno-delete-null-pointer-checks. > > So the kernel obviously is already using its own C dialect, that is > pretty far from standard C. > All these options a

Re: [isocpp-parallel] Proposal for new memory_order_consume definition

2016-02-29 Thread Linus Torvalds

On Mon, Feb 29, 2016 at 9:37 AM, Michael Matz wrote: > >The important part is with induction variables controlling > loops: > > short i; for (i = start; i < end; i++) > vs. > unsigned short u; for (u = start; u < end; u++) > > For the former you're allowed to assume that the loop will termina

Re: [PATCH] tell gcc optimizer to never introduce new data races

2014-06-10 Thread Linus Torvalds

On Tue, Jun 10, 2014 at 6:23 AM, Jiri Kosina wrote: > We have been chasing a memory corruption bug, which turned out to be > caused by very old gcc (4.3.4), which happily turned conditional load into > a non-conditional one, and that broke correctness (the condition was met > only if lock was held

Re: [RFC] Design for flag bit outputs from asms

2015-05-04 Thread Linus Torvalds

On Mon, May 4, 2015 at 1:14 PM, H. Peter Anvin wrote: > > I would argue that for x86 what you actually want is to model the > *conditions* that are available on the flags, not the flags themselves. Yes. Otherwise it would be a nightmare to try to describe simple conditions like "le", which a rath

Re: [RFC] Design for flag bit outputs from asms

2015-05-04 Thread Linus Torvalds

On Mon, May 4, 2015 at 1:33 PM, Richard Henderson wrote: > > A fair point. Though honestly, I was hoping that this feature would mostly be > used for conditions that are "weird" -- that is, not normally describable by > arithmetic at all. Otherwise, why are you using inline asm for it? I could

Re: [RFC] Design for flag bit outputs from asms

2015-05-05 Thread Linus Torvalds

On Tue, May 5, 2015 at 6:50 AM, Segher Boessenkool wrote: > > Since it is pre-processed, there is no real reason to overlap this with > the constraints namespace; we could have e.g. "=@[xy]" (and "@[xy]" for > inputs) mean the target needs to do some "xy" transform here. In fact, standing out vis

Re: Compilers and RCU readers: Once more unto the breach!

2015-05-19 Thread Linus Torvalds

On Tue, May 19, 2015 at 5:55 PM, Paul E. McKenney wrote: > > http://www.rdrop.com/users/paulmck/RCU/consume.2015.05.18a.pdf From a very quick read-through, the restricted dependency chain in 7.9 seems to be reasonable, and essentially covers "that's what hardware gives us anyway", making

Re: Compilers and RCU readers: Once more unto the breach!

2015-05-19 Thread Linus Torvalds

On Tue, May 19, 2015 at 6:57 PM, Linus Torvalds wrote: > > - the "you can add/subtract integral values" still opens you up to > language lawyers claiming "(char *)ptr - (intptr_t)ptr" preserving the > dependency, which it clearly doesn't. But language

Re: Compilers and RCU readers: Once more unto the breach!

2015-05-21 Thread Linus Torvalds

On Thu, May 21, 2015 at 1:02 PM, Paul E. McKenney wrote: > > The compiler can (and does) speculate non-atomic non-volatile writes > in some cases, but I do not believe that it is permitted to speculate > either volatile or atomic writes. I do *not* believe that a compiler is ever allowed to specu

Re: Memory corruption due to word sharing

2012-02-01 Thread Linus Torvalds

On Wed, Feb 1, 2012 at 7:19 AM, Jan Kara wrote: > > we've spotted the following mismatch between what kernel folks expect > from a compiler and what GCC really does, resulting in memory corruption on > some architectures. This is sad. We've had something like this before due to architectural re

Re: Memory corruption due to word sharing

2012-02-01 Thread Linus Torvalds

On Wed, Feb 1, 2012 at 8:37 AM, Colin Walters wrote: > > 1) Use the same lock for a given bitfield That's not the problem. All the *bitfield* fields are all accessed under the same word already. > 2) Split up the bitfield into different words Again, it's not the bitfield that is the problem. T

Re: Memory corruption due to word sharing

2012-02-01 Thread Linus Torvalds

On Wed, Feb 1, 2012 at 9:08 AM, Torvald Riegel wrote: > > What do the kernel folks think about the C11 memory model? If you can > spot any issues in there, the GCC community would certainly like to > know. I don't think this is about memory models except very tangentially. Gcc currently accesse

Re: Memory corruption due to word sharing

2012-02-01 Thread Linus Torvalds

On Wed, Feb 1, 2012 at 9:11 AM, Jiri Kosina wrote: > On Wed, 1 Feb 2012, Linus Torvalds wrote: >> >> And I suspect it really is a generic bug that can be shown even with >> the above trivial example. > > I have actually tried exactly this earlier today (because while

Re: Memory corruption due to word sharing

2012-02-01 Thread Linus Torvalds

On Wed, Feb 1, 2012 at 9:41 AM, Michael Matz wrote: > > One problem is that it's not a new problem, GCC emitted similar code since > about forever, and still they turned up only now (well, probably because > ia64 is dead, but sparc64 should have similar problems). The bitfield > handling code is

Re: Memory corruption due to word sharing

2012-02-01 Thread Linus Torvalds

On Wed, Feb 1, 2012 at 10:09 AM, David Miller wrote: > > Personally I've avoided C bitfields like the plague in any code I've > written. I do agree with that. The kernel largely tries to avoid bitfields, usually because we have some really strict rules about different bitfields, but also because

Re: Memory corruption due to word sharing

2012-02-01 Thread Linus Torvalds

On Wed, Feb 1, 2012 at 10:45 AM, Jeff Law wrote: > > Torvald Riegel & I were told that was kernel policy when we brought up the > upcoming bitfield semantic changes with some of the linux kernel folks last > year. Btw, one reason this is true is that the bitfield ordering/packing is so unspecifie

Re: Memory corruption due to word sharing

2012-02-01 Thread Linus Torvalds

On Wed, Feb 1, 2012 at 9:42 AM, Torvald Riegel wrote: > > We need a proper memory model. Not really. The fact is, the kernel will happily take the memory model of the underlying hardware. Trying to impose some compiler description of the memory model is actually horribly bad, because it automati

Re: Memory corruption due to word sharing

2012-02-01 Thread Linus Torvalds

On Wed, Feb 1, 2012 at 11:40 AM, Jakub Jelinek wrote: > > Well, the C++11/C11 model doesn't allow to use the underlying type > for accesses, consider e.g. > > struct S { long s1; unsigned int s2 : 5; unsigned int s3 : 19; unsigned char > s4; unsigned int s5; }; > struct T { long s1 : 16; unsigned

Re: Memory corruption due to word sharing

2012-02-01 Thread Linus Torvalds

On Wed, Feb 1, 2012 at 12:01 PM, Linus Torvalds wrote: > > - However, while using the *smallest* possible access may generate > correct code, it often generates really *crappy* code. Which is > exactly the bug that I reported in > > http://gcc.gnu.org/bugzilla/show_bug.cgi?i

Re: Memory corruption due to word sharing

2012-02-01 Thread Linus Torvalds

On Wed, Feb 1, 2012 at 12:16 PM, Jakub Jelinek wrote: >> >> So the kernel really doesn't care what you do to things *within* the >> bitfield. > > But what is *within* the bitfield? Do you consider s4 or t2 fields > (non-bitfield fields that just the ABI wants to pack together with > the bitfield

Re: Memory corruption due to word sharing

2012-02-01 Thread Linus Torvalds

On Wed, Feb 1, 2012 at 12:41 PM, Torvald Riegel wrote: > > You do rely on the compiler to do common transformations I suppose: > hoist loads out of loops, CSE, etc. How do you expect the compiler to > know whether they are allowed for a particular piece of code or not? We have barriers. Compile

Re: Memory corruption due to word sharing

2012-02-01 Thread Linus Torvalds

On Wed, Feb 1, 2012 at 12:53 PM, Torvald Riegel wrote: > > For volatile, I agree. > > However, the original btrfs example was *without* a volatile, and that's > why I raised the memory model point. This triggered an error in a > concurrent execution, so that's memory model land, at least in C > l

Re: Memory corruption due to word sharing

2012-02-01 Thread Linus Torvalds

On Wed, Feb 1, 2012 at 1:24 PM, Torvald Riegel wrote: >> It's not the only thing we do. We have cases where it's not that you >> can't hoist things outside of loops, it's that you have to read things >> exactly *once*, and then use that particular value (ie the compiler >> can't be allowed to relo
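The "read things exactly *once*" requirement quoted above is what the kernel's ACCESS_ONCE() macro was used for. A minimal sketch in GNU C, assuming the historical volatile-cast form of the macro; shared_flag, snapshot_flag() and the demo function are hypothetical names for illustration and do not come from the thread:

```c
#include <assert.h>

/* Sketch only: GNU C (relies on __typeof__). The volatile cast forces
 * the compiler to emit exactly one load for each use of the macro. */
#define ACCESS_ONCE(x) (*(volatile __typeof__(x) *)&(x))

static int shared_flag;   /* hypothetical variable another context may update */

/* Read shared_flag a single time and work only on that snapshot, so the
 * value tested and the value returned cannot differ. */
static int snapshot_flag(void)
{
    int v = ACCESS_ONCE(shared_flag);

    if (v < 0)
        return 0;
    return v;
}

/* Single-threaded usage example. */
static void snapshot_flag_demo(void)
{
    shared_flag = 5;
    assert(snapshot_flag() == 5);
    shared_flag = -1;
    assert(snapshot_flag() == 0);
}
```

Without the volatile access the compiler would be free to reload shared_flag after the comparison, which is the kind of reload the quoted message says cannot be allowed.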

Re: Memory corruption due to word sharing

2012-02-01 Thread Linus Torvalds

On Wed, Feb 1, 2012 at 1:25 PM, Boehm, Hans wrote: > > Here are some more interesting ones that illustrate the issues (all > declarations are non-local, unless stated otherwise): > > struct { char a; int b:9; int c:7; char d} x; > > Is x.b = 1 allowed to overwrite x.a? C11 says no, essentially r
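The struct in the quoted question can be written out as a small C sketch. It only pins down the end state that the "C11 says no" answer implies, that a store to x.b leaves x.a and x.d unchanged. It runs single-threaded, so it cannot demonstrate the concurrent overwrite being discussed; the function name is hypothetical, and the ';' missing after d in the quoted declaration is added:

```c
#include <assert.h>

/* Layout from the quoted message. */
static struct { char a; int b:9; int c:7; char d; } x;

static void store_b_keeps_neighbours(void)
{
    x.a = 'A';
    x.d = 'D';
    x.b = 1;             /* may be compiled as a read-modify-write of a wider unit */
    assert(x.a == 'A');  /* the neighbouring non-bitfield members keep their values */
    assert(x.d == 'D');
    assert(x.b == 1);
}
```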

Re: Memory corruption due to word sharing

2012-02-01 Thread Linus Torvalds

On Wed, Feb 1, 2012 at 2:45 PM, Paul E. McKenney wrote: > > My (perhaps forlorn and naive) hope is that C++11 memory_order_relaxed > will eventually allow ACCESS_ONCE() to be upgraded so that (for example) > access-once increments can generate a single increment-memory instruction > on x86. I don
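For reference, a rough C11 counterpart of an access-once increment is a relaxed load followed by a relaxed store. The sketch below assumes <stdatomic.h> is available; relaxed_increment() and the demo are hypothetical names. It is not an atomic read-modify-write, and whether a given compiler turns it into a single increment-memory instruction is up to that compiler:

```c
#include <assert.h>
#include <stdatomic.h>

/* Load once, store once, both relaxed: no ordering against other
 * accesses and no atomicity for the increment as a whole. */
static void relaxed_increment(atomic_int *p)
{
    int v = atomic_load_explicit(p, memory_order_relaxed);

    atomic_store_explicit(p, v + 1, memory_order_relaxed);
}

/* Single-threaded usage example. */
static void relaxed_increment_demo(void)
{
    atomic_int counter = 0;

    relaxed_increment(&counter);
    relaxed_increment(&counter);
    assert(atomic_load(&counter) == 2);
}
```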

Re: Memory corruption due to word sharing

2012-02-02 Thread Linus Torvalds

On Thu, Feb 2, 2012 at 8:28 AM, Michael Matz wrote: > > Sure. Simplest example: struct s {int i:24;} __attribute__((packed)). > > You must access only three bytes, no matter what. The basetype (int) is > four bytes. Ok, so here's a really *stupid* (but also really really simple) patch attached.
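The packed example from the quoted message, written out as a GNU C sketch. store_and_load() and the demo are hypothetical helpers. With GCC on common Linux targets the struct is expected to occupy three bytes, but the code below does not depend on that:

```c
#include <assert.h>

/* Declaration from the quoted message: a 24-bit field whose base type
 * (int) is wider than the packed object itself. */
struct s { int i:24; } __attribute__((packed));

/* Store through the bitfield and read it back. */
static int store_and_load(struct s *p, int v)
{
    p->i = v;
    return p->i;
}

/* Usage example with a value that fits in 24 bits. */
static void packed_bitfield_demo(void)
{
    struct s obj;

    assert(store_and_load(&obj, 0x123456) == 0x123456);
    assert(sizeof(struct s) <= sizeof(int));
}
```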

Re: Memory corruption due to word sharing

2012-02-02 Thread Linus Torvalds

On Thu, Feb 2, 2012 at 10:42 AM, Paul E. McKenney wrote: >> >> SMP-atomic or percpu atomic? Or both? > > Only SMP-atomic. And I assume that since the compiler does them, that would now make it impossible for us to gather a list of all the 'lock' prefixes so that we can undo them if it turns out t

Re: Memory corruption due to word sharing

2012-02-03 Thread Linus Torvalds

On Fri, Feb 3, 2012 at 8:38 AM, Andrew MacLeod wrote: > > The atomic intrinsics were created for c++11 memory model compliance, but I > am certainly open to enhancements that would make them more useful. I am > planning some enhancements for 4.8 now, and it sounds like you may have some > sugge

Re: Memory corruption due to word sharing

2012-02-03 Thread Linus Torvalds

On Fri, Feb 3, 2012 at 11:16 AM, Andrew MacLeod wrote: >> The special cases are because older x86 cannot do the generic >> "add_return" efficiently - it needs xadd - but can do atomic versions >> that test the end result and give zero or sign information. > > Since these are older x86 only, cou

Re: BUG: GCC-4.4.x changes the function frame on some functions

2009-11-19 Thread Linus Torvalds

On Thu, 19 Nov 2009, Thomas Gleixner wrote: > > standard function start: > > push %ebp > mov %esp, %ebp > > call mcount > > modified function start on a handful of functions only seen with gcc > 4.4.x on x86 32 bit: > > push %edi > lea

Re: BUG: GCC-4.4.x changes the function frame on some functions

2009-11-19 Thread Linus Torvalds

On Thu, 19 Nov 2009, Richard Guenther wrote: > > Note that I only can reproduce the issue with > -mincoming-stack-boundary=2, not with -mpreferred-stack-boundary=2. Since you can reproduce it with -mincoming-stack-boundary=2, I would suggest just fixing mcount handling that way regardless of an

Re: BUG: GCC-4.4.x changes the function frame on some functions

2009-11-19 Thread Linus Torvalds

On Thu, 19 Nov 2009, Andrew Haley wrote: > > I've got all that off-list. I found the cause, and replied in another > email. It's not a bug. Oh Gods, are we back to gcc people saying "sure, we do stupid things, but it's allowed, so we don't consider it a bug because it doesn't matter that re

Re: BUG: GCC-4.4.x changes the function frame on some functions

2009-11-19 Thread Linus Torvalds

On Thu, 19 Nov 2009, Linus Torvalds wrote: > > Oh Gods, are we back to gcc people saying "sure, we do stupid things, but > it's allowed, so we don't consider it a bug because it doesn't matter that > real people care about real life, we only care about some

Re: BUG: GCC-4.4.x changes the function frame on some functions

2009-11-19 Thread Linus Torvalds

On Thu, 19 Nov 2009, Linus Torvalds wrote: > > I bet other people than just the kernel use the mcount hook for subtler > things than just doing profiles. And even if they don't, the quoted code > generation is just crazy _crap_. For the kernel, if the only case is that t

Re: BUG: GCC-4.4.x changes the function frame on some functions

2009-11-19 Thread Linus Torvalds

On Thu, 19 Nov 2009, H. Peter Anvin wrote: > > Calling the profiler immediately at the entry point is clearly the more > sane option. It means the ABI is well-defined, stable, and independent > of what the actual function contents are. It means that ABI isn't the > normal C ABI (the __fentry__

Re: BUG: GCC-4.4.x changes the function frame on some functions

2009-11-19 Thread Linus Torvalds

On Thu, 19 Nov 2009, Frederic Weisbecker wrote: > > > That way the lr would have the current function, and the parent would > > still be at 8(%sp) > > Yeah right, we need at least such very tiny prologue for > archs that store return addresses in a reg. Well, it will be architecture-dependent.

Re: BUG: GCC-4.4.x changes the function frame on some functions

2009-11-19 Thread Linus Torvalds

On Fri, 20 Nov 2009, Thomas Gleixner wrote: > > While testing various kernel configs we found out that the problem > comes and goes. Finally I started to compare the gcc command line > options and after some fiddling it turned out that the following > minimal deltas change the code generator beh

Re: gcc 4.5.1 / as 2.20.51.0.11 miscompiling drivers/char/i8k.c ?

2010-11-14 Thread Linus Torvalds

On Sun, Nov 14, 2010 at 4:52 PM, James Cloos wrote: > Gcc 4.5.1 running on an amd64 box "cross"-compiling for a P3 i8k fails > to compile the module since commit 6b4e81db2552bad04100e7d5ddeed7e848f53b48 > with: > > CC drivers/char/i8k.o > drivers/char/i8k.c: In function ‘i8k_smm’: > drivers/

Re: gcc 4.5.1 / as 2.20.51.0.11 miscompiling drivers/char/i8k.c ?

2010-11-15 Thread Linus Torvalds

On Mon, Nov 15, 2010 at 3:16 AM, Jakub Jelinek wrote: > > I don't see any problems on the assembly level. i8k_smm is > not inlined in this case and checks all 3 conditions. If it really is related to gcc not understanding that "*regs" has changed due to the memory being an automatic variable, an

Re: gcc 4.5.1 / as 2.20.51.0.11 miscompiling drivers/char/i8k.c ?

2010-11-15 Thread Linus Torvalds

On Mon, Nov 15, 2010 at 9:40 AM, Jim Bos wrote: > > Hmm, that doesn't work. > > [ Not sure if you read to whole thread but initial workaround was to > change the asm(..) to asm volatile(..) which did work. ] Since I have a different gcc than yours (and I'm not going to compile my own), have you p

Re: gcc 4.5.1 / as 2.20.51.0.11 miscompiling drivers/char/i8k.c ?

2010-11-15 Thread Linus Torvalds

On Mon, Nov 15, 2010 at 10:30 AM, Jim Bos wrote: > > Attached version with plain 2.6.36 source and version with the committed > patch, i.e with the '"+m" (*regs)' Looks 100% identical in i8k_smm() itself, and I'm not seeing anything bad. The asm has certainly not been optimized away as implied in

Re: gcc 4.5.1 / as 2.20.51.0.11 miscompiling drivers/char/i8k.c ?

2010-11-15 Thread Linus Torvalds

On Mon, Nov 15, 2010 at 10:45 AM, Jeff Law wrote: > > A memory clobber should clobber anything in memory, including autos in > memory; if it doesn't, then that seems like a major problem. I'd like to > see the rationale behind not clobbering autos in memory. Yes. It turns out that the "asm optim

Re: gcc 4.5.1 / as 2.20.51.0.11 miscompiling drivers/char/i8k.c ?

2010-11-15 Thread Linus Torvalds

On Mon, Nov 15, 2010 at 11:12 AM, Jakub Jelinek wrote: > > Ah, the problem is that memory_identifier_string is only initialized in > ipa-reference.c's initialization, so it can be (and is in this case) NULL in > ipa-pure-const.c. Ok. And I guess you can verify that all versions of gcc do this cor

Re: [RFC] gcc feature request: Moving blocks into sections

2013-08-05 Thread Linus Torvalds

On Mon, Aug 5, 2013 at 9:55 AM, Steven Rostedt wrote: > > Almost a full year ago, Mathieu suggested something like: > > if (unlikely(x)) __attribute__((section(".unlikely"))) { > ... > } else __attribute__((section(".likely"))) { > ... > } It's almost certainly a horrible idea. F

Re: [RFC] gcc feature request: Moving blocks into sections

2013-08-05 Thread Linus Torvalds

On Mon, Aug 5, 2013 at 10:12 AM, Linus Torvalds wrote: > > Secondly, you don't want a separate section anyway for any normal > kernel code, since you want short jumps if possible Just to clarify: the short jump is important regardless of how unlikely the code you're jumpin

Re: [RFC] gcc feature request: Moving blocks into sections

2013-08-05 Thread Linus Torvalds

On Mon, Aug 5, 2013 at 10:55 AM, Steven Rostedt wrote: > > My main concern is with tracepoints. Which on 90% (or more) of systems > running Linux, is completely off, and basically just dead code, until > someone wants to see what's happening and enables them. The static_key_false() approach with

Re: [RFC] gcc feature request: Moving blocks into sections

2013-08-05 Thread Linus Torvalds

On Mon, Aug 5, 2013 at 11:20 AM, Linus Torvalds wrote: > > The static_key_false() approach with minimal inlining sounds like a > much better approach overall. Sorry, I misunderstood your thing. That's actually what you want that section thing for, because right now you canno

Re: [RFC] gcc feature request: Moving blocks into sections

2013-08-05 Thread Linus Torvalds

On Mon, Aug 5, 2013 at 11:24 AM, Linus Torvalds wrote: > > Ugh. I can see the attraction of your section thing for that case, I > just get the feeling that we should be able to do better somehow. Hmm.. Quite frankly, Steven, for your use case I think you actually want the C got

Re: [RFC] gcc feature request: Moving blocks into sections

2013-08-05 Thread Linus Torvalds

On Mon, Aug 5, 2013 at 11:39 AM, Steven Rostedt wrote: > > I had patches that did exactly this: > > https://lkml.org/lkml/2012/3/8/461 > > But it got dropped for some reason. I don't remember why. Maybe because > of the complexity? Ugh. Why the crazy update_jump_label script stuff? I'd go "Eww"

Re: [RFC] gcc feature request: Moving blocks into sections

2013-08-05 Thread Linus Torvalds

On Mon, Aug 5, 2013 at 11:51 AM, H. Peter Anvin wrote: >> >> Also, how would you pass the parameters? Every tracepoint has its own >> parameters to pass to it. How would a trap know what where to get "prev" >> and "next"? > > How do you do that now? > > You have to do an IP lookup to find out what

Re: [RFC] gcc feature request: Moving blocks into sections

2013-08-05 Thread Linus Torvalds

On Mon, Aug 5, 2013 at 12:04 PM, Andi Kleen wrote: > Steven Rostedt writes: > > Can't you just use -freorder-blocks-and-partition? > > This should already partition unlikely blocks into a > different section. Just a single one of course. That's horrible. Not because of dwarf problems, but exactl

Re: [RFC] gcc feature request: Moving blocks into sections

2013-08-05 Thread Linus Torvalds

On Mon, Aug 5, 2013 at 12:40 PM, Marek Polacek wrote: > > FWIW, we also support hot/cold attributes for labels, thus e.g. > > if (bar ()) > goto A; > /* ... */ > A: __attribute__((cold)) > /* ... */ > > I don't know whether that might be useful for what you want or not though... Steve?

Re: [RFC] gcc feature request: Moving blocks into sections

2013-08-05 Thread Linus Torvalds

On Mon, Aug 5, 2013 at 12:54 PM, Mathieu Desnoyers wrote: > > I remember that choosing between 2 and 5 bytes nop in the asm goto was > tricky: it had something to do with the fact that gcc doesn't know the > exact size of each instructions until further down within compilation Oh, you can't do it

Re: [RFC] gcc feature request: Moving blocks into sections

2013-08-06 Thread Linus Torvalds

On Tue, Aug 6, 2013 at 7:19 AM, Steven Rostedt wrote: > > After playing with the patches again, I now understand why I did that. > It wasn't just for optimization. [explanation snipped] > Anyway, if you feel that update_jump_label is too complex, I can go the > "update at early boot" route and s

Re: [RFC][PATCH 0/5] arch: atomic rework

2014-02-09 Thread Linus Torvalds

On Sun, Feb 9, 2014 at 4:27 PM, Torvald Riegel wrote: > > I wouldn't characterize the situation like this (although I can't speak > for others, obviously). IMHO, it's perfectly fine on sequential / > non-synchronizing code, because we know the difference isn't observable > by a correct program.

Re: [RFC][PATCH 0/5] arch: atomic rework

2014-02-09 Thread Linus Torvalds

On Sun, Feb 9, 2014 at 5:16 PM, Torvald Riegel wrote: > > (a) seems to say that you don't like requiring programmers to mark > atomic accesses specially. Is that the case? In Paul's example, they were marked specially. And you seemed to argue that Paul's example could possibly return anything b

Re: [RFC][PATCH 0/5] arch: atomic rework

2014-02-09 Thread Linus Torvalds

On Sun, Feb 9, 2014 at 5:46 PM, Torvald Riegel wrote: > > IOW, I wrote that such a compiler transformation would be wrong in my > opinion. Thus, it should *not* return 42. Ahh, I am happy to have misunderstood. The "intuitively" threw me, because I thought that was building up to a "but", and mi

Re: [RFC][PATCH 0/5] arch: atomic rework

2014-02-10 Thread Linus Torvalds

On Sun, Feb 9, 2014 at 4:27 PM, Torvald Riegel wrote: > > Intuitively, this is wrong because this lets the program take a step > the abstract machine wouldn't do. This is different to the sequential > code that Peter posted because it uses atomics, and thus one can't > easily assume that the dif

Re: [RFC][PATCH 0/5] arch: atomic rework

2014-02-12 Thread Linus Torvalds

On Wed, Feb 12, 2014 at 10:07 AM, Paul E. McKenney wrote: > > Us Linux-kernel hackers will often need to use volatile semantics in > combination with C11 atomics in most cases. The C11 atomics do cover > some of the reasons we currently use ACCESS_ONCE(), but not all of them -- > in particular, i

Re: [RFC][PATCH 0/5] arch: atomic rework

2014-02-14 Thread Linus Torvalds

On Fri, Feb 14, 2014 at 9:29 AM, Paul E. McKenney wrote: > > Linus, Peter, any objections to marking places where we are relying on > ordering from control dependencies against later stores? This approach > seems to me to have significant documentation benefits. Quite frankly, I think it's stupi

Re: [RFC][PATCH 0/5] arch: atomic rework

2014-02-14 Thread Linus Torvalds

On Fri, Feb 14, 2014 at 11:50 AM, Linus Torvalds wrote: > > Why are we still discussing this idiocy? It's irrelevant. If the > standard really allows random store speculation, the standard doesn't > matter, and sane people shouldn't waste their time arguing about it.

Re: [RFC][PATCH 0/5] arch: atomic rework

2014-02-14 Thread Linus Torvalds

On Fri, Feb 14, 2014 at 6:08 PM, Paul E. McKenney wrote: > > One way of looking at the discussion between Torvald and myself would be > as a seller (Torvald) and a buyer (me) haggling over the fine print in > a proposed contract (the standard). Whether that makes you feel better > or worse about

Re: [RFC][PATCH 0/5] arch: atomic rework

2014-02-14 Thread Linus Torvalds

On Fri, Feb 14, 2014 at 6:44 PM, Linus Torvalds wrote: > > And conversely, the C11 people can walk away from us too. But if they > can't make us happy (and by "make us happy", I really mean no stupid > games on our part) I personally think they'll have a stronger

Re: [RFC][PATCH 0/5] arch: atomic rework

2014-02-15 Thread Linus Torvalds

On Sat, Feb 15, 2014 at 9:45 AM, Torvald Riegel wrote: > > I think a major benefit of C11's memory model is that it gives a > *precise* specification for how a compiler is allowed to optimize. Clearly it does *not*. This whole discussion is proof of that. It's not at all clear, and the standard a

Re: [RFC][PATCH 0/5] arch: atomic rework

2014-02-15 Thread Linus Torvalds

On Sat, Feb 15, 2014 at 9:30 AM, Torvald Riegel wrote: > > I think the example is easy to misunderstand, because the context isn't > clear. Therefore, let me first try to clarify the background. > > (1) The abstract machine does not write speculatively. > (2) Emitting a branch instruction and exe

Re: [RFC][PATCH 0/5] arch: atomic rework

2014-02-17 Thread Linus Torvalds

On Mon, Feb 17, 2014 at 11:55 AM, Torvald Riegel wrote: > > Which example do you have in mind here? Haven't we resolved all the > debated examples, or did I miss any? Well, Paul seems to still think that the standard possibly allows speculative writes or possibly value speculation in ways that b

Re: [RFC][PATCH 0/5] arch: atomic rework

2014-02-17 Thread Linus Torvalds

On Mon, Feb 17, 2014 at 1:21 PM, Torvald Riegel wrote: > On Mon, 2014-02-17 at 12:18 -0800, Linus Torvalds wrote: >> and then it is read by people (compiler writers) that intentionally >> try to mis-use the words and do language-lawyering ("that depends on >> w

Re: [RFC][PATCH 0/5] arch: atomic rework

2014-02-17 Thread Linus Torvalds

On Mon, Feb 17, 2014 at 2:09 PM, Torvald Riegel wrote: > On Sat, 2014-02-15 at 11:15 -0800, Linus Torvalds wrote: >> > >> > if (atomic_load(&x, mo_relaxed) == 1) >> > atomic_store(&y, 3, mo_relaxed)); >> >> No, please don't use this
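The quoted fragment uses shorthand (atomic_load with an mo_relaxed argument). In standard C11 the same control-dependent store would be spelled with the _explicit functions, roughly as in the sketch below. store_if_set() and the demo are hypothetical names, and the code only shows the single-threaded behaviour, not the ordering question under debate:

```c
#include <assert.h>
#include <stdatomic.h>

static atomic_int x, y;   /* both start at 0 */

/* Store 3 to y only if the relaxed load of x observed 1. */
static void store_if_set(void)
{
    if (atomic_load_explicit(&x, memory_order_relaxed) == 1)
        atomic_store_explicit(&y, 3, memory_order_relaxed);
}

/* Single-threaded usage example. */
static void store_if_set_demo(void)
{
    store_if_set();
    assert(atomic_load(&y) == 0);   /* x was 0, so no store happened */
    atomic_store(&x, 1);
    store_if_set();
    assert(atomic_load(&y) == 3);
}
```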

Re: [RFC][PATCH 0/5] arch: atomic rework

2014-02-17 Thread Linus Torvalds

On Mon, Feb 17, 2014 at 2:25 PM, Torvald Riegel wrote: > On Mon, 2014-02-17 at 14:02 -0800, Linus Torvalds wrote: >> >> The argument was that an lvalue doesn't actually "access" the memory >> (an rvalue does), so this: >> >>volatile int *p = ..

Re: [RFC][PATCH 0/5] arch: atomic rework

2014-02-17 Thread Linus Torvalds

On Mon, Feb 17, 2014 at 3:10 PM, Alec Teal wrote: > > You mean "unambiguous" - try reading a patent (Apple have 1000s of trivial > ones, I tried reading one once thinking "how could they have phrased it so > this got approved", their technique was to make the reader want to start > cutting themsel

Re: [RFC][PATCH 0/5] arch: atomic rework

2014-02-17 Thread Linus Torvalds

On Mon, Feb 17, 2014 at 3:17 PM, Torvald Riegel wrote: > On Mon, 2014-02-17 at 14:32 -0800, > >> Stop claiming it "can return 1".. It *never* returns 1 unless you do >> the load and *verify* it, or unless the load itself can be made to go >> away. And with the code sequence given, that just doesn'

Re: [RFC][PATCH 0/5] arch: atomic rework

2014-02-17 Thread Linus Torvalds

On Mon, Feb 17, 2014 at 3:41 PM, Torvald Riegel wrote: > > There's an underlying problem here that's independent from the actual > instance that you're worried about here: "no sense" is ultimately a > matter of taste/objectives/priorities as long as the respective > specification is logically co

Re: [RFC][PATCH 0/5] arch: atomic rework

2014-02-17 Thread Linus Torvalds

On Mon, Feb 17, 2014 at 7:00 PM, Paul E. McKenney wrote: > > One example that I learned about last week uses the branch-prediction > hardware to validate value speculation. And no, I am not at all a fan > of value speculation, in case you were curious. Heh. See the example I used in my reply to

Re: [RFC][PATCH 0/5] arch: atomic rework

2014-02-17 Thread Linus Torvalds

On Mon, Feb 17, 2014 at 7:24 PM, Linus Torvalds wrote: > > As far as I can tell, the intent is that you can't do value > speculation (except perhaps for the "relaxed", which quite frankly > sounds largely useless). Hmm. The language I see for "consume" is n

Re: [RFC][PATCH 0/5] arch: atomic rework

2014-02-18 Thread Linus Torvalds

On Tue, Feb 18, 2014 at 7:31 AM, Torvald Riegel wrote: > On Mon, 2014-02-17 at 16:05 -0800, Linus Torvalds wrote: >> And exactly because I know enough, I would *really* like atomics to be >> well-defined, and have very clear - and *local* - rules about how they >> can be c

Re: [RFC][PATCH 0/5] arch: atomic rework

2014-02-18 Thread Linus Torvalds

On Tue, Feb 18, 2014 at 4:12 AM, Peter Sewell wrote: > > For example, suppose we have, in one compilation unit: > > void f(int ra, int*rb) { > if (ra==42) > *rb=42; > else > *rb=42; > } So this is a great example, and in general I really like your page at: > F
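Both branches of the quoted f() store the same value, so in a single thread it is indistinguishable from an unconditional store. The sketch below puts the two forms side by side. f_collapsed() is a hypothetical name for what a compiler might reduce f() to, and the comparison says nothing about how the two forms behave on concurrent hardware:

```c
#include <assert.h>

/* Function from the quoted message. */
void f(int ra, int *rb)
{
    if (ra == 42)
        *rb = 42;
    else
        *rb = 42;
}

/* Hypothetical collapsed form: the branch on ra is gone. */
void f_collapsed(int ra, int *rb)
{
    (void)ra;
    *rb = 42;
}

/* Single-threaded check that both forms leave the same value behind. */
static void f_equivalence_demo(void)
{
    int a = 0, b = 0;

    f(7, &a);
    f_collapsed(7, &b);
    assert(a == 42 && b == 42);
}
```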

Re: [RFC][PATCH 0/5] arch: atomic rework

2014-02-18 Thread Linus Torvalds

On Tue, Feb 18, 2014 at 8:17 AM, Torvald Riegel wrote: >> >> "Consume operation: no reads in the current thread dependent on the >> value currently loaded can be reordered before this load" > > I can't remember seeing that language in the standard (ie, C or C++). > Where is this from? That's ju

Re: [RFC][PATCH 0/5] arch: atomic rework

2014-02-18 Thread Linus Torvalds

On Tue, Feb 18, 2014 at 10:21 AM, Peter Sewell wrote: > > This is a bit more subtle, because (on ARM and POWER) removing the > dependency and conditional branch is actually in general *not* equivalent > in the hardware, in a concurrent context. So I agree, but I think that's a generic issue with

Re: [RFC][PATCH 0/5] arch: atomic rework

2014-02-18 Thread Linus Torvalds

On Tue, Feb 18, 2014 at 10:23 AM, Peter Sewell wrote: > > interesting list. So are you saying that value-range-analysis and > such-like (I say glibly, without really knowing what "such-like" > refers to here) are fundamentally incompatible with > the kernel code No, it's fine to do things like v

Re: [RFC][PATCH 0/5] arch: atomic rework

2014-02-18 Thread Linus Torvalds

On Tue, Feb 18, 2014 at 1:21 PM, Torvald Riegel wrote: >> >> So imagine that you have some clever global optimizer that sees that >> the program never ever actually sets the dirty bit at all in any >> thread, and then uses that kind of non-local knowledge to make >> optimization decisions. THAT WO

Re: [RFC][PATCH 0/5] arch: atomic rework

2014-02-19 Thread Linus Torvalds

On Wed, Feb 19, 2014 at 6:40 AM, Torvald Riegel wrote: > > If all those other threads written in whichever way use the same memory > model and ABI for synchronization (e.g., choice of HW barriers for a > certain memory_order), it doesn't matter whether it's a hardware thread, > microcode, whatever

Re: [RFC][PATCH 0/5] arch: atomic rework

2014-02-19 Thread Linus Torvalds

On Tue, Feb 18, 2014 at 11:47 AM, Torvald Riegel wrote: > On Tue, 2014-02-18 at 09:44 -0800, Linus Torvalds wrote: >> >> Can you point to it? Because I can find a draft standard, and it sure >> as hell does *not* contain any clarity of the model. It has a *lot* of >>

Re: [RFC][PATCH 0/5] arch: atomic rework

2014-02-19 Thread Linus Torvalds

On Wed, Feb 19, 2014 at 8:01 PM, Paul E. McKenney wrote: > > The control dependency should order subsequent stores, at least assuming > that "a" and "b" don't start off with identical stores that the compiler > could pull out of the "if" and merge. The same might also be true for ?: > for all I k

Re: [RFC][PATCH 0/5] arch: atomic rework

2014-02-20 Thread Linus Torvalds

On Thu, Feb 20, 2014 at 12:30 AM, Paul E. McKenney wrote: >> >> So lets make this really simple: if you have a consume->cmp->read, is >> the ordering of the two reads guaranteed? > > Not as far as I know. Also, as far as I know, there is no difference > between consume and relaxed in the consume-

Re: [RFC][PATCH 0/5] arch: atomic rework

2014-02-20 Thread Linus Torvalds

On Thu, Feb 20, 2014 at 9:14 AM, Torvald Riegel wrote: >> >> So the clarification is basically to the statement that the "if >> (consume(p)) a" version *would* have an ordering guarantee between the >> read of "p" and "a", but the "consume(p) ? a : b" would *not* have >> such an ordering guarantee
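For readers following the thread, the two source forms being contrasted in this snippet can be written out in C11 roughly as below. This is only a sketch: the names p, a, b, read_if_form and read_ternary_form are hypothetical, and the code shows the syntactic shapes only. It does not demonstrate any ordering guarantee, and which form is ordered is exactly what the thread is arguing about.

```c
#include <stdatomic.h>

/* Hypothetical shared state, for illustration only. */
static _Atomic int p;
static int a = 1, b = 2;

/* "if (consume(p)) a": the read of a sits behind a conditional branch
 * on the consumed value (a control dependency). */
int read_if_form(void)
{
    if (atomic_load_explicit(&p, memory_order_consume))
        return a;
    return 0;
}

/* "consume(p) ? a : b": the consumed value only selects which of two
 * reads supplies the result. */
int read_ternary_form(void)
{
    return atomic_load_explicit(&p, memory_order_consume) ? a : b;
}
```

Single-threaded, both functions simply return the selected value; the difference under discussion only matters when another thread publishes p concurrently.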

Re: [RFC][PATCH 0/5] arch: atomic rework

2014-02-20 Thread Linus Torvalds

On Thu, Feb 20, 2014 at 9:49 AM, Torvald Riegel wrote: > > Yes, mo_consume is more tricky than mo_acquire. > > However, that has an advantage because you can avoid getting stronger > barriers if you don't need them (ie, you can avoid the "auto-update to > acquire" you seem to have in mind). Oh, I

Re: [RFC][PATCH 0/5] arch: atomic rework

2014-02-20 Thread Linus Torvalds

On Thu, Feb 20, 2014 at 10:11 AM, Paul E. McKenney wrote: > > You really need that "consume" to be "acquire". So I think we now all agree that that is what the standard is saying. And I'm saying that that is wrong, that the standard is badly written, and should be fixed. Because before the stan

Re: [RFC][PATCH 0/5] arch: atomic rework

2014-02-20 Thread Linus Torvalds

On Thu, Feb 20, 2014 at 10:25 AM, Linus Torvalds wrote: > > While in my *sane* model, where you can consume things even if they > then result in control dependencies, there will still eventually be a > "sync" instruction on powerpc (because you really need one between the

Re: [RFC][PATCH 0/5] arch: atomic rework

2014-02-20 Thread Linus Torvalds

On Thu, Feb 20, 2014 at 11:02 AM, Linus Torvalds wrote: > > Again, the way I'd expect a compiler writer to actually *do* this is > to just default to "ac Oops, pressed send by mistake too early. I was almost done: I'd expect a compiler to just default to "acquir

Re: [RFC][PATCH 0/5] arch: atomic rework

2014-02-20 Thread Linus Torvalds

On Thu, Feb 20, 2014 at 10:53 AM, Torvald Riegel wrote: > On Thu, 2014-02-20 at 10:32 -0800, Linus Torvalds wrote: >> On Thu, Feb 20, 2014 at 10:11 AM, Paul E. McKenney >> wrote: >> > >> > You really need that "consume" to be "acquire".

Re: [RFC][PATCH 0/5] arch: atomic rework

2014-02-20 Thread Linus Torvalds

On Thu, Feb 20, 2014 at 10:56 AM, Paul E. McKenney wrote: > > The example gcc breakage was something like this: > > i = atomic_load(idx, memory_order_consume); > x = array[0 + i - i]; > > Then gcc optimized this to: > > i = atomic_load(idx, memory_order_consume); >
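The quoted example can be made compilable in C11 along the following lines. This is a sketch under stated assumptions: atomic_load_explicit is the standard spelling of the two-argument load, while array, idx and both function names are hypothetical. The snippet above is cut off before the optimized form, so the second function is not taken from the mail; it is only what ordinary algebraic simplification of 0 + i - i would produce.

```c
#include <stdatomic.h>

/* Hypothetical data, for illustration only. */
static int array[4] = {10, 20, 30, 40};
static _Atomic int idx;

/* As written in the mail: the index expression syntactically depends
 * on the consumed value i. */
int load_with_dependency(void)
{
    int i = atomic_load_explicit(&idx, memory_order_consume);
    return array[0 + i - i];
}

/* Assumed result of constant folding: the index is the constant 0, so
 * the array read no longer uses i at all. */
int load_after_folding(void)
{
    int i = atomic_load_explicit(&idx, memory_order_consume);
    (void)i; /* value loaded but unused once the arithmetic is folded */
    return array[0];
}
```

Both functions return the same value in a single thread. The concern raised in the thread is that the second form no longer carries a data dependency from the load to the array access.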

Re: [RFC][PATCH 0/5] arch: atomic rework

2014-02-20 Thread Linus Torvalds

On Thu, Feb 20, 2014 at 2:10 PM, Paul E. McKenney wrote: > > Linus, given that you are calling me out for pushing "legalistic and bad" > things, "syntactic bullshit", and playing "little games", I am forced > to conclude that you have never attended any sort of standards-committee > meeting. ;-)

Re: [RFC][PATCH 0/5] arch: atomic rework

2014-02-21 Thread Linus Torvalds

On Fri, Feb 21, 2014 at 10:25 AM, Peter Sewell wrote: > > If one thinks this is too fragile, then simply using memory_order_acquire > and paying the resulting barrier cost (and perhaps hoping that compilers > will eventually be able to optimise some cases of those barriers to > hardware-level depe

Re: [RFC][PATCH 0/5] arch: atomic rework

2014-02-21 Thread Linus Torvalds

On Fri, Feb 21, 2014 at 11:16 AM, Linus Torvalds wrote: > > Why would this be any different, especially since it's easy to > understand both for a human and a compiler? Btw, the actual data path may actually be semantically meaningful even at a processor level. For example, let

Re: [RFC][PATCH 0/5] arch: atomic rework

2014-02-21 Thread Linus Torvalds

On Fri, Feb 21, 2014 at 11:43 AM, Peter Sewell wrote: > > You have to track dependencies through other assignments, e.g. simple x=y That is all visible in the SSA form. Variable assignment has been converted to some use of the SSA node that generated the value. The use might be a phi node or a ca

Re: [RFC][PATCH 0/5] arch: atomic rework

2014-02-22 Thread Linus Torvalds

On Sat, Feb 22, 2014 at 10:53 AM, Torvald Riegel wrote: > > Stating that (1) "the standard is wrong" and (2) that you think that > mo_consume semantics are not good is two different things. I do agree. They are two independent things. I think the standard is wrong, because it's overly complex, h

Re: [RFC][PATCH 0/5] arch: atomic rework

2014-02-22 Thread Linus Torvalds

On Sat, Feb 22, 2014 at 4:39 PM, Paul E. McKenney wrote: > > Agreed, by far the most frequent use is "->" to dereference and assignment > to store into a local variable. The other operations where the kernel > expects ordering to be maintained are: > > o Bitwise "&" to strip off low-order b
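The two operations named in this snippet, stripping low-order bits with "&" and then dereferencing with "->", might look like this in C11. It is a minimal sketch, not kernel code: struct node, TAG_MASK, published, publish and read_value are all hypothetical names, and the two-bit tag assumes the pointed-to object is at least 4-byte aligned.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical node type; an int member gives at least 4-byte alignment
 * on common ABIs, which leaves the two low pointer bits free. */
struct node { int value; };

#define TAG_MASK ((uintptr_t)0x3)

/* Pointer and tag packed into one atomic word. */
static _Atomic uintptr_t published;

void publish(struct node *n, unsigned tag)
{
    atomic_store_explicit(&published,
                          (uintptr_t)n | (tag & TAG_MASK),
                          memory_order_release);
}

int read_value(void)
{
    /* Load the tagged word, mask off the tag bits, then dereference.
     * The address used by "->" is computed from the loaded value. */
    uintptr_t raw = atomic_load_explicit(&published, memory_order_consume);
    struct node *n = (struct node *)(raw & ~TAG_MASK);
    return n->value;
}
```

The point of the sketch is that the dereferenced address is still derived from the consumed value after the mask, which is the kind of chain the snippet says the kernel expects to stay ordered.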
