Re: [patch] x86: improved memory barrier implementation

2007-09-30 Thread Nick Piggin
On Sat, Sep 29, 2007 at 09:07:30AM -0700, Linus Torvalds wrote: > > > On Sat, 29 Sep 2007, Nick Piggin wrote: > > > > > > The non-temporal stores should be basically considered to be "IO", not > > > any > > > normal memory operation. > > > > Maybe you're thinking of uncached / WC? Non-tempora

Re: [patch] x86: improved memory barrier implementation

2007-09-29 Thread Dave Jones
On Fri, Sep 28, 2007 at 05:07:19PM +0100, Alan Cox wrote: > > Winchip: can any of these CPUs with ooostores do SMP? If not, then smp_wmb > > can also be a simple barrier on i386 too. > > The IDT Winchip can do SMP apparently. From the Winchip3 (which was the final winchip) specs.. "The ID

Re: [patch] x86: improved memory barrier implementation

2007-09-29 Thread Linus Torvalds
On Sat, 29 Sep 2007, Nick Piggin wrote: > > > > The non-temporal stores should be basically considered to be "IO", not any > > normal memory operation. > > Maybe you're thinking of uncached / WC? Non-temporal stores to cacheable > RAM apparently can go out of order too, and they are being used

Re: [patch] x86: improved memory barrier implementation

2007-09-29 Thread Nick Piggin
On Fri, Sep 28, 2007 at 06:18:31PM +0100, Alan Cox wrote: > > on the broken ppro stores config option if you just tell me what should > > be there (again, remember that my patch isn't actually changing anything > > already there except for smp_rmb side). > > The PPro needs rmb to ensure a store do

Re: [patch] x86: improved memory barrier implementation

2007-09-29 Thread Nick Piggin
On Fri, Sep 28, 2007 at 09:15:06AM -0700, Linus Torvalds wrote: > > > On Fri, 28 Sep 2007, Alan Cox wrote: > > > > However > > - You've not shown the patch has any performance gain > > It would be nice to see this. Actually, in a userspace test I have (which actually does enough work to trigg

Re: [patch] x86: improved memory barrier implementation

2007-09-28 Thread Alan Cox
> on the broken ppro stores config option if you just tell me what should > be there (again, remember that my patch isn't actually changing anything > already there except for smp_rmb side). The PPro needs rmb to ensure a store doesn't go for a walk on the wild side and pass the read especially wh

Re: [patch] x86: improved memory barrier implementation

2007-09-28 Thread Nick Piggin
On Fri, Sep 28, 2007 at 05:07:19PM +0100, Alan Cox wrote: > > The only alternative is to assume a weak memory model, and add the required > > barriers to spin_unlock -- something that has been explicitly avoided, but > > We have the barriers in spin_unlock already for Pentium Pro and IDT > Winchip

Re: [patch] x86: improved memory barrier implementation

2007-09-28 Thread Linus Torvalds
On Fri, 28 Sep 2007, Alan Cox wrote: > > However > - You've not shown the patch has any performance gain It would be nice to see this. > - You've probably broken Pentium Pro Probably not a big deal, but yeah, we should have that broken-ppro option. > - and for modern processors its still no

Re: [patch] x86: improved memory barrier implementation

2007-09-28 Thread Alan Cox
> The only alternative is to assume a weak memory model, and add the required > barriers to spin_unlock -- something that has been explicitly avoided, but We have the barriers in spin_unlock already for Pentium Pro and IDT Winchip systems. The Winchip explicitly supports out of order store (and wa

[patch] x86: improved memory barrier implementation

2007-09-28 Thread Nick Piggin
According to latest memory ordering specification documents from Intel and AMD, both manufacturers are committed to in-order loads from cacheable memory for the x86 architecture. Hence, smp_rmb() may be a simple barrier. Also according to those documents, and according to existing practice in Linu
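
As a rough illustration of the ordering the patch description relies on, here is a minimal user-space sketch in portable C11. It is not the kernel's smp_rmb()/smp_wmb() code and not taken from the patch; the names payload, ready, publish() and try_consume() are hypothetical. A writer stores data and then sets a flag; a reader checks the flag and then reads the data. The two fences mark the places where a write barrier and a read barrier would sit. On x86, compilers generally do not need to emit a fence instruction for a release or acquire fence, only to refrain from reordering the accesses themselves, which is the sense in which a read barrier can be "a simple barrier" on in-order-load hardware.

```c
#include <stdatomic.h>

/* Hypothetical names, for illustration only; not kernel definitions. */
static int payload;        /* ordinary data guarded by the flag below */
static atomic_int ready;   /* zero-initialised: nothing published yet */

/* Writer side: store the data first, then set the flag. */
static void publish(int value)
{
    payload = value;
    /* Plays the role of a write barrier: the data store may not be
     * reordered after the flag store. */
    atomic_thread_fence(memory_order_release);
    atomic_store_explicit(&ready, 1, memory_order_relaxed);
}

/* Reader side: returns 1 and fills *out once the flag has been seen,
 * otherwise returns 0 and leaves *out untouched. */
static int try_consume(int *out)
{
    if (!atomic_load_explicit(&ready, memory_order_relaxed))
        return 0;
    /* Plays the role of a read barrier: the data load may not be
     * reordered before the flag load. */
    atomic_thread_fence(memory_order_acquire);
    *out = payload;
    return 1;
}
```

In a real test, publish() and try_consume() would run on different threads; the sketch only shows where the barriers go, not how much they cost.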