On Monday, 7 June 2021, Aneesh Kumar K.V wrote:
>
> This patchset enables MOVE_PMD/MOVE_PUD support on power. This requires
> the platform to support updating higher-level page tables without
> updating page table entries. This also needs to invalidate the Page Walk
> Cache on architecture suppor
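
A minimal sketch of the operation this enables, written with the generic mremap-style helpers; it illustrates the requirement described above rather than the patchset itself, and it assumes flush_tlb_range() also invalidates the page walk cache on this platform:

/* Illustrative only: mirrors the shape of the generic move_normal_pmd(). */
static void move_whole_pmd(struct vm_area_struct *vma, unsigned long old_addr,
                           unsigned long new_addr, pmd_t *old_pmd, pmd_t *new_pmd)
{
        pmd_t pmd;

        /* Detach the PTE page from the old slot without touching its PTEs. */
        pmd = *old_pmd;
        pmd_clear(old_pmd);

        /* Re-attach the same PTE page at the new slot. */
        pmd_populate(vma->vm_mm, new_pmd, pmd_pgtable(pmd));

        /*
         * The stale higher-level translation must go away; on this platform
         * the flush also has to cover the Page Walk Cache.
         */
        flush_tlb_range(vma, old_addr, old_addr + PMD_SIZE);
}

The point is that only the PMD (or PUD) entry moves; the page table entries underneath it are never rewritten.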
On 12 May 2017 at 13:35, Michael Ellerman wrote:
> Nicholas Piggin writes:
>
> > The single-operand form of tlbie used to be accepted as the second
> > operand (L) being implicitly 0. Newer binutils reject this.
> >
> > Change remaining single-op tlbie instructions to have explicit 0
> > second
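
For illustration, the shape of the change being described looks like the sketch below; this is not one of the actual hunks, "va" is a placeholder effective address, and the only point is spelling out the second (L) operand:

static inline void example_tlbie(unsigned long va)
{
        /* Old form, implicit L=0, rejected by newer binutils: "tlbie %0" */
        /* New form, with the L operand written explicitly:               */
        asm volatile("tlbie %0,0" : : "r" (va) : "memory");
}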
On Fri, Jul 09, 2010 at 09:34:16AM +0200, Jens Axboe wrote:
> On 2010-07-09 08:57, divya wrote:
> > On Friday 02 July 2010 12:16 PM, divya wrote:
> >> On Thursday 01 July 2010 11:55 PM, Maciej Rutecki wrote:
> >>> On środa, 30 czerwca 2010 o 13:22:27 divya wrote:
> While running fs_racer test
> > [c0010978fc80] [c008deb4] .do_fork+0x190/0x3cc
> > [c0010978fdc0] [c0011ef4] .sys_clone+0x58/0x70
> > [c0010978fe30] [c00087f0] .ppc_clone+0x8/0xc
> > Instruction dump:
> > 419e0010 7fe3fb78 480774cd 6000 801f0014 e93f0008 7800b842 392
ants to support.
>
> Therefore, also take the RCU read lock along with disabling IRQs to ensure
> the RCU grace period does at least cover these lookups.
>
> Signed-off-by: Peter Zijlstra
> Requested-by: Paul E. McKenney
> Cc: Nick Piggin
> Cc: Benjamin Herrenschmidt
>
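
The pattern being described looks roughly like the sketch below; walk_page_tables() is a hypothetical helper standing in for the lockless lookup, and the point is pairing IRQs-off with an explicit RCU read-side section:

static int lockless_lookup(struct mm_struct *mm, unsigned long addr,
                           struct page **pagep)
{
        unsigned long flags;
        int ret;

        local_irq_save(flags);          /* holds off IPI-based TLB flushes   */
        rcu_read_lock();                /* covers RCU-freed page-table pages */
        ret = walk_page_tables(mm, addr, pagep);        /* hypothetical */
        rcu_read_unlock();
        local_irq_restore(flags);

        return ret;
}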
, rather than
> > simply killing current.
> >
> > Cc: linuxppc-...@ozlabs.org
> > Cc: Benjamin Herrenschmidt
> > Cc: linux-a...@vger.kernel.org
> > Signed-off-by: Nick Piggin
> > ---
> > Index: linux-2.6/arch/powerpc/mm/fault.c
> >
On Wed, Mar 24, 2010 at 06:56:31PM +1100, Benjamin Herrenschmidt wrote:
> Some powerpc code needs to ensure that all previous iounmap/vunmap operations
> have really been flushed out of the MMU hash table. Without that, various
> hotplug operations may fail when trying to return those pieces to
> the hyperviso
On Wed, Feb 10, 2010 at 10:04:06PM +1100, Anton Blanchard wrote:
>
> For performance reasons we are about to change ISYNC_ON_SMP to sometimes be
> lwsync. Now that the macro name doesn't make sense, change it and
> LWSYNC_ON_SMP
> to better explain what the barriers are doing.
>
> Signed-off-by:
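
Presumably the result reads roughly like the sketch below; the macro names and expansions here are assumptions about where the rename ends up, not text from the quoted patch:

#ifdef CONFIG_SMP
#define PPC_ACQUIRE_BARRIER     "\n\tisync"     /* after taking a lock      */
#define PPC_RELEASE_BARRIER     "\n\tlwsync"    /* before releasing a lock  */
#else
#define PPC_ACQUIRE_BARRIER
#define PPC_RELEASE_BARRIER
#endif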
On Wed, Feb 17, 2010 at 08:43:14PM +1100, Anton Blanchard wrote:
>
> Hi Nick,
>
> > Ah, good to see this one come back. I also tested tbench over localhost
> > btw which actually did show some speedup on the G5.
> >
> > BTW. this was the last thing left:
> > http://www.mail-archive.com/linuxpp
On Wed, Feb 17, 2010 at 08:37:14PM +1100, Anton Blanchard wrote:
>
> Hi Nick,
>
> > Cool. How does it go when there are significant amount of instructions
> > between the lock and the unlock? A real(ish) workload, like dbench on
> > ramdisk (which should hit the dcache lock).
>
> Good question,
On Wed, Feb 10, 2010 at 10:10:25PM +1100, Anton Blanchard wrote:
>
> Nick Piggin discovered that lwsync barriers around locks were faster than
> isync
> on 970. That was a long time ago and I completely dropped the ball in testing
> his patches across other ppc64 processors.
>
On Wed, Feb 10, 2010 at 09:57:28PM +1100, Anton Blanchard wrote:
>
> Recent versions of the PowerPC architecture added a hint bit to the larx
> instructions to differentiate between an atomic operation and a lock
> operation:
>
> > 0 Other programs might attempt to modify the word in storage add
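
A sketch of a lock acquisition that passes the hint, assuming a binutils new enough to accept the four-operand lwarx form; this is illustrative, not the kernel's spinlock code:

static inline void lock_acquire_with_hint(unsigned int *lock)
{
        unsigned int tmp;

        asm volatile(
"1:     lwarx   %0,0,%1,1\n"    /* EH=1: reservation is for a lock   */
"       cmpwi   0,%0,0\n"
"       bne-    1b\n"           /* already held: spin                */
"       stwcx.  %2,0,%1\n"
"       bne-    1b\n"           /* lost the reservation: retry       */
"       isync\n"                /* acquire barrier                   */
        : "=&r" (tmp)
        : "r" (lock), "r" (1)
        : "cr0", "memory");
}

With EH=0 (or the plain three-operand form) the same sequence describes an ordinary atomic update.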
On Tue, Jul 21, 2009 at 10:02:26AM +1000, Benjamin Herrenschmidt wrote:
> On Mon, 2009-07-20 at 12:38 +0200, Nick Piggin wrote:
> > On Mon, Jul 20, 2009 at 08:00:41PM +1000, Benjamin Herrenschmidt wrote:
> > > On Mon, 2009-07-20 at 10:10 +0200, Nick Piggin wrote:
> > >
On Mon, Jul 20, 2009 at 07:59:21PM +1000, Benjamin Herrenschmidt wrote:
> On Mon, 2009-07-20 at 10:05 +0200, Nick Piggin wrote:
> >
> > Unless anybody has other preferences, just send it straight to Linus in
> > the next merge window -- if any conflicts did come up any
On Mon, Jul 20, 2009 at 08:00:41PM +1000, Benjamin Herrenschmidt wrote:
> On Mon, 2009-07-20 at 10:10 +0200, Nick Piggin wrote:
> >
> > Maybe I don't understand your description correctly. The TLB contains
> > PMDs, but you say the HW still logically performs another
On Thu, Jul 16, 2009 at 11:54:15AM +1000, Benjamin Herrenschmidt wrote:
> On Wed, 2009-07-15 at 15:56 +0200, Nick Piggin wrote:
> > Interesting arrangement. So are these last level ptes modifieable
> > from userspace or something? If not, I wonder if you could manage
> > the
On Mon, Jul 20, 2009 at 05:11:13PM +1000, Benjamin Herrenschmidt wrote:
> On Wed, 2009-07-15 at 15:56 +0200, Nick Piggin wrote:
> > > I would like to merge the new support that depends on this in 2.6.32,
> > > so unless there's major objections, I'd like this to go
On Wed, Jul 15, 2009 at 05:49:47PM +1000, Benjamin Herrenschmidt wrote:
> Upcoming patches to support the new 64-bit "BookE" powerpc architecture
> will need to have the virtual address corresponding to PTE page when
> freeing it, due to the way the HW table walker works.
>
> Basically, the TLB can
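
A hypothetical call site sketching the interface change being described, rather than the patch itself:

static void free_pte_page(struct mmu_gather *tlb, pgtable_t token,
                          unsigned long addr)
{
        /*
         * Previously just pte_free_tlb(tlb, token); now the virtual address
         * rides along so a hardware-walked TLB can invalidate the right
         * translations when the PTE page goes away.
         */
        pte_free_tlb(tlb, token, addr);
}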
On Fri, Jun 12, 2009 at 01:38:50PM +0530, Sachin Sant wrote:
> Nick Piggin wrote:
> >>I was able to boot yesterday's next (20090611) on this machine. Not sure
> >>
> >
> >Still with SLQB? With debug options turned on?
> >
> Ah .. spoke too soon
On Fri, Jun 12, 2009 at 11:14:10AM +0530, Sachin Sant wrote:
> Nick Piggin wrote:
> >I can't really work it out. It seems to be the kmem_cache_cache which has
> >a problem, but there have already been lots of caches created and even
> >this same cache_node already use
On Mon, Jun 08, 2009 at 05:42:14PM +0530, Sachin Sant wrote:
> Pekka J Enberg wrote:
> >Hi Sachin,
> __slab_alloc_page: nid=2, cache_node=c000de01ba00,
> cache_list=c000de01ba00
> __slab_alloc_page: nid=2, cache_node=c000de01bd00,
> cache_list=c000de01bd00
> __slab_alloc_page: nid
On Tue, May 12, 2009 at 04:52:45PM +1000, Stephen Rothwell wrote:
> Hi Nick,
>
> On Tue, 12 May 2009 16:03:52 +1000 Stephen Rothwell
> wrote:
> >
> > This is what I have been getting for the last few days:
>
> bisected into the net changes, I will follow up there, sorry.
No problem. Phew, your
On Tue, May 12, 2009 at 03:56:13PM +1000, Stephen Rothwell wrote:
> Hi Nick,
>
> On Tue, 12 May 2009 06:57:16 +0200 Nick Piggin wrote:
> >
> > Hmm, I think (hope) your problems were fixed with the recent memory
> > corruption bug fix for SLQB. (if not, let me know)
On Mon, May 11, 2009 at 06:21:35AM -0600, Matthew Wilcox wrote:
> On Mon, May 11, 2009 at 05:34:07PM +0530, Sachin Sant wrote:
> > Matthew Wilcox wrote:
> >> On Mon, May 11, 2009 at 05:16:10PM +0530, Sachin Sant wrote:
> >>
> >>> Today's Next tree failed to boot on a Power6 box with following BU
On Fri, May 01, 2009 at 12:00:33AM +1000, Stephen Rothwell wrote:
> Hi Nick,
>
> On Thu, 30 Apr 2009 15:05:42 +0200 Nick Piggin wrote:
> >
> > Hmm, this might do it. The following code now passes some stress testing
> > in a userspace harness whereas before it did n
On Thu, Apr 30, 2009 at 02:20:29PM +0300, Pekka Enberg wrote:
> On Thu, 2009-04-30 at 13:18 +0200, Nick Piggin wrote:
> > OK thanks. So I think we have 2 problems. One with MAX_ORDER <= 9
> > that is fixed by the previous patch, and another which is probably
> > due to ha
On Thu, Apr 30, 2009 at 09:00:04PM +1000, Stephen Rothwell wrote:
> Hi Pekka, Nick,
>
> On Thu, 30 Apr 2009 13:38:04 +0300 Pekka Enberg
> wrote:
> >
> > Stephen, does this patch fix all the boot problems for you as well?
>
> Unfortunately not, I am still getting this:
>
> Memory: 1967708k/2097
On Thu, Apr 30, 2009 at 03:17:12PM +0530, Sachin Sant wrote:
> Nick Piggin wrote:
> >Hmm, forget that. Actually my last patch had a silly mistake because I
> >forgot MAX_ORDER shift is applied to PAGE_SIZE, rather than 1. So
> >kmalloc(PAGE_SIZE) was failing as too large.
>
On Thu, Apr 30, 2009 at 11:06:36AM +0530, Sachin Sant wrote:
> Nick Piggin wrote:
> >Well kmalloc is failing. It should not be though, even if the
> >current node is offline, it should be able to fall back to other
> >nodes. Stephen's trace indicates the same thing
On Wed, Apr 29, 2009 at 09:56:19PM +0530, Sachin Sant wrote:
> Nick Piggin wrote:
> >Does this help?
> >---
> With the patch the machine boots past the failure point, but panics
> immediately with the following trace...
OK good, that solves one problem.
> Unable to hand
0x1c/0x44
OK I think the problem is that with 64K pages you get a default MAX_ORDER
of 9, and slqb is trying to create slabs which exceed that size.
Does this help?
---
SLQB: fix slab calculation
SLQB didn't consider MAX_ORDER when defining which sizes of kmalloc
slabs to create. It panics at
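
Conceptually the fix has to clamp the slab order by what the page allocator can actually provide, something along these lines (a sketch, not SLQB's code):

static int slab_order_for_size(size_t object_size)
{
        int order = get_order(object_size);

        /*
         * The page allocator cannot satisfy an allocation of MAX_ORDER
         * pages or more, so refuse to create such a kmalloc cache at all.
         */
        if (order > MAX_ORDER - 1)
                return -EINVAL;

        return order;
}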
On Tue, Apr 28, 2009 at 02:22:06PM +0300, Pekka Enberg wrote:
> Nick,
>
> Here's another one. I think we need to either fix these rather quickly
> or make SLUB the default for linux-next again so we don't interfere
> with other testing.
Yeah, I'm working on it. Let me either give you a fix or a pa
On Wed, Mar 04, 2009 at 03:04:11PM +1100, Benjamin Herrenschmidt wrote:
> On Thu, 2009-02-19 at 18:21 +0100, Nick Piggin wrote:
> > OK, here is this patch again. You didn't think I'd let a 2% performance
> > improvement be forgotten? :)
> >
> > Anyway,
On Wed, Mar 04, 2009 at 03:03:15PM +1100, Benjamin Herrenschmidt wrote:
> Allright, sorry for the delay, I had those stored into my "need more
> than half a brain cell for review" list and only got to them today :-)
No problem :)
> On Thu, 2009-02-19 at 18:12 +0100
OK, here is this patch again. You didn't think I'd let a 2% performance
improvement be forgotten? :)
Anyway, the patch won't work well on architectures without lwsync, but I won't
bother fixing that kind of thing and making it merge worthy until you
guys say something positive about it.
20 runs of tbe
Using an lwsync, isync sequence in a microbenchmark is 5 times faster on my G5
than using sync for smp_mb, although it takes more instructions.
Running tbench with 4 clients on my 4 core G5 (20 times) gives the
following:
unpatched AVG=920.33 STD=2.36
patched AVG=921.27 STD=2.77
So not a big imp
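
A userspace microbenchmark in this spirit can be as small as the sketch below; it only builds and means anything on a 64-bit PowerPC host, and the iteration count is arbitrary rather than anything taken from the numbers above:

#include <stdio.h>
#include <stdint.h>

/* Read the time base register to get a cheap cycle-like counter. */
static inline uint64_t mftb(void)
{
        uint64_t tb;
        asm volatile("mftb %0" : "=r" (tb));
        return tb;
}

int main(void)
{
        enum { N = 10 * 1000 * 1000 };
        uint64_t t0, t1, t2;
        int i;

        t0 = mftb();
        for (i = 0; i < N; i++)
                asm volatile("sync" ::: "memory");      /* full barrier */
        t1 = mftb();
        for (i = 0; i < N; i++)
                asm volatile("lwsync" ::: "memory");    /* lighter barrier */
        t2 = mftb();

        printf("sync:   %llu timebase ticks\n", (unsigned long long)(t1 - t0));
        printf("lwsync: %llu timebase ticks\n", (unsigned long long)(t2 - t1));
        return 0;
}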
Ideally the generic code would be able to measure it in case the platform
does not provide it.
But this simple patch at least makes it throttle again.
Signed-off-by: Nick Piggin
---
Index: linux-2.6/arch/powerpc/platforms/powermac/cp
On Tue, Feb 17, 2009 at 03:55:40AM +0300, Alexey Dobriyan wrote:
> FYI, on powerpc-64-smp-n-debug-n:
>
> mm/slqb.c: In function '__slab_free':
> mm/slqb.c:1648: error: implicit declaration of function 'slab_free_to_remote'
> mm/slqb.c: In function 'kmem_cache_open':
> mm/slqb.c:2174: error: implic
On Tue, Feb 10, 2009 at 01:53:51PM +0200, Pekka Enberg wrote:
> On Tue, Feb 10, 2009 at 11:54 AM, Sachin P. Sant wrote:
> > Sachin P. Sant wrote:
> >>
> >> Hi Stephen,
> >>
> >> Todays next randconfig build on powerpc fails with
> >>
> >> CC mm/slqb.o
> >> mm/slqb.c: In function __slab_free:
On Wednesday 04 February 2009 16:13:29 Andrew Morton wrote:
> On Wed, 04 Feb 2009 12:50:48 +1100 Benjamin Herrenschmidt
> > Do the generic hugetlbfs code provides such an API ? If not, we may need
> > to add one.
>
> I think it's something like
>
> huge_page_size(page_hstate(page))
That wou
On Friday 12 December 2008 13:47, Andrew Morton wrote:
> On Fri, 12 Dec 2008 12:31:33 +1000 Nick Piggin
wrote:
> > On Friday 12 December 2008 07:43, Andrew Morton wrote:
> > > On Thu, 11 Dec 2008 20:28:00 +
> > >
> > > > Do they actually cross the pag
On Friday 12 December 2008 07:43, Andrew Morton wrote:
> On Thu, 11 Dec 2008 20:28:00 +
> > Do they actually cross the page boundaries?
>
> Some flavours of slab have at times done an order-1 allocation for
> objects which would fit into an order-0 page (etc) if it looks like
> that will be b
On Tuesday 18 November 2008 13:08, Linus Torvalds wrote:
> On Tue, 18 Nov 2008, Paul Mackerras wrote:
> > Also, you didn't respond to my comments about the purely software
> > benefits of a larger page size.
>
> I realize that there are benefits. It's just that the downsides tend to
> swamp the ups
On Tuesday 18 November 2008 09:53, Paul Mackerras wrote:
> I'd love to be able to use a 4k base page size if I could still get
> the reduction in page faults and the expanded TLB reach that we get
> now with 64k pages. If we could allocate the page cache for large
> files with order-4 allocations
Implement a more optimal mutex fastpath for powerpc, making use of acquire
and release barrier semantics. This takes the mutex lock+unlock benchmark
from 203 to 173 cycles on a G5.
Signed-off-by: Nick Piggin <[EMAIL PROTECTED]>
---
Index: linux-2.6/arch/powerpc/include/asm/m
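
The flavour of that fastpath, heavily simplified (the real patch goes through the asm-generic mutex hooks and a cmpxchg-style helper), is an ll/sc attempt followed by an acquire barrier:

/* Sketch: mutex count convention is 1 = unlocked, 0 = locked. */
static inline int mutex_trylock_acquire(int *count)
{
        int old;

        asm volatile(
"1:     lwarx   %0,0,%1\n"      /* load the count with a reservation */
"       cmpwi   0,%0,1\n"
"       bne-    2f\n"           /* not unlocked: fail                */
"       stwcx.  %2,0,%1\n"      /* try to store 0 = locked           */
"       bne-    1b\n"
"       isync\n"                /* acquire: later accesses stay below */
"2:"
        : "=&r" (old)
        : "r" (count), "r" (0)
        : "cr0", "memory");

        return old == 1;        /* got the lock iff we saw it unlocked */
}

The unlock side would mirror this with a release barrier (lwsync) before the store that puts the count back to 1.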
Signed-off-by: Nick Piggin <[EMAIL PROTECTED]>
---
Index: linux-2.6/arch/powerpc/include/asm/system.h
===
--- linux-2.6.orig/arch/powerpc/include/asm/system.h 2008-11-12 12:28:57.0 +1100
+++ linux-2.6/arch/powerpc/inclu
C,
causing smp_wmb to revert back to eieio for all CPUs. Restore the behaviour
introduced in 74f0609526afddd88bef40b651da24f3167b10b2.
Signed-off-by: Nick Piggin <[EMAIL PROTECTED]>
---
Index: linux-2.6/arch/powerpc/inc
On Thu, Nov 06, 2008 at 03:09:08PM +1100, Paul Mackerras wrote:
> Nick Piggin writes:
>
> > On Sun, Oct 12, 2008 at 07:47:32AM +0200, Nick Piggin wrote:
> > >
> > > Implement a more optimal mutex fastpath for powerpc, making use of acquire
> > > and re
On Mon, Nov 03, 2008 at 04:32:22PM +1100, Paul Mackerras wrote:
> Nick Piggin writes:
>
> > This is an interesting one for me. AFAIKS it is possible to use lwsync for
> > a full barrier after a successful ll/sc operation, right? (or stop me here
> > if I'm wrong
On Sat, Nov 01, 2008 at 11:47:58AM -0500, Kumar Gala wrote:
>
> On Nov 1, 2008, at 7:33 AM, Nick Piggin wrote:
>
> >A previous change removed __SUBARCH_HAS_LWSYNC define, and replaced it
> >with __powerpc64__. smp_wmb() seems to be the last place not updated.
>
> Uugh
Hi guys,
This is an interesting one for me. AFAIKS it is possible to use lwsync for
a full barrier after a successful ll/sc operation, right? (or stop me here
if I'm wrong).
Anyway, I was interested in exploring this. Unfortunately my G5 might not
be very indicative of more modern, and future dev
smp_rmb can be lwsync if possible. Clarify the comment.
Signed-off-by: Nick Piggin <[EMAIL PROTECTED]>
---
Index: linux-2.6/arch/powerpc/include/asm/system.h
===
--- linux-2.6.orig/arch/powerpc/include/asm/system.h2008-11-
A previous change removed __SUBARCH_HAS_LWSYNC define, and replaced it
with __powerpc64__. smp_wmb() seems to be the last place not updated.
Signed-off-by: Nick Piggin <[EMAIL PROTECTED]>
---
Index: linux-2.6/arch/powerpc/include/asm/sy
On Thu, Oct 23, 2008 at 03:43:58PM +1100, Benjamin Herrenschmidt wrote:
> On Wed, 2008-10-22 at 17:59 +0200, Ingo Molnar wrote:
> > * Nick Piggin <[EMAIL PROTECTED]> wrote:
> >
> > > Speed up generic mutex implementations.
> > >
> > > - atomic oper
On Mon, Oct 13, 2008 at 11:20:20AM -0500, Scott Wood wrote:
> On Mon, Oct 13, 2008 at 11:15:47AM -0500, Scott Wood wrote:
> > On Sun, Oct 12, 2008 at 07:47:32AM +0200, Nick Piggin wrote:
> > > +static inline int __mutex_cmpxchg_lock(atomic_t *v, int old, int new)
>
On Sun, Oct 12, 2008 at 07:47:32AM +0200, Nick Piggin wrote:
>
> Implement a more optimal mutex fastpath for powerpc, making use of acquire
> and release barrier semantics. This takes the mutex lock+unlock benchmark
> from 203 to 173 cycles on a G5.
>
>
Implement a more optimal mutex fastpath for powerpc, making use of acquire
and release barrier semantics. This takes the mutex lock+unlock benchmark
from 203 to 173 cycles on a G5.
Signed-off-by: Nick Piggin <[EMAIL PROTECTED]>
---
Index: linux-2.6/arch/powerpc/include/asm/m
nlock test from 590 cycles
to 203 cycles on a ppc970 system.
Signed-off-by: Nick Piggin <[EMAIL PROTECTED]>
---
Index: linux-2.6/include/asm-generic/mutex-dec.h
===
--- linux-2.6.orig/include/asm-generic/mutex-dec.h
+++ linux
On Wednesday 20 August 2008 07:08, Steven Rostedt wrote:
> On Tue, 19 Aug 2008, Mathieu Desnoyers wrote:
> > Ok, there are two cases where it's ok :
> >
> > 1 - in stop_machine, considering we are not touching code executed in
> > NMI handlers.
> > 2 - when using my replace_instruction_safe() which
On Thu, Jul 31, 2008 at 01:48:31PM -0500, Kumar Gala wrote:
> Implement _PAGE_SPECIAL and pte_special() for 32-bit powerpc. This bit will
> be used by the fast get_user_pages() to differentiate PTEs that correspond
> to a valid struct page from special mappings that don't such as IO mappings
> obta
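
The shape of what such a patch provides is roughly the following; the bit value is a placeholder, not the real 32-bit powerpc PTE layout:

#define _PAGE_SPECIAL   0x00000400      /* placeholder software PTE bit */

static inline int pte_special(pte_t pte)
{
        /* "Special" means there is no struct page behind this PTE. */
        return pte_val(pte) & _PAGE_SPECIAL;
}

static inline pte_t pte_mkspecial(pte_t pte)
{
        return __pte(pte_val(pte) | _PAGE_SPECIAL);
}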
On Thursday 31 July 2008 21:27, Mel Gorman wrote:
> On (31/07/08 16:26), Nick Piggin didst pronounce:
> > I imagine it should be, unless you're using a CPU with seperate TLBs for
> > small and huge pages, and your large data set is mapped with huge pages,
> > in which ca
On Thursday 31 July 2008 16:14, Andrew Morton wrote:
> On Thu, 31 Jul 2008 16:04:14 +1000 Nick Piggin <[EMAIL PROTECTED]>
wrote:
> > > Do we expect that this change will be replicated in other
> > > memory-intensive apps? (I do).
> >
> > Such as what? It
On Thursday 31 July 2008 03:34, Andrew Morton wrote:
> On Wed, 30 Jul 2008 18:23:18 +0100 Mel Gorman <[EMAIL PROTECTED]> wrote:
> > On (30/07/08 01:43), Andrew Morton didst pronounce:
> > > On Mon, 28 Jul 2008 12:17:10 -0700 Eric Munson <[EMAIL PROTECTED]>
wrote:
> > > > Certain workloads benefit
On Wed, Jul 30, 2008 at 07:33:26AM -0500, Kumar Gala wrote:
>
> On Jul 29, 2008, at 10:37 PM, Benjamin Herrenschmidt wrote:
>
> >From: Nick Piggin <[EMAIL PROTECTED]>
> >
> >Implement lockless get_user_pages_fast for powerpc. Page table
> >existence
>
On Wed, Jul 30, 2008 at 03:08:40PM +1000, Benjamin Herrenschmidt wrote:
> On Wed, 2008-07-30 at 15:06 +1000, Michael Ellerman wrote:
> > > +
> > > +/*
> > > + * The performance critical leaf functions are made noinline otherwise
> > > gcc
> > > + * inlines everything into a single function which r
On Thursday 24 July 2008 20:50, Sebastien Dugue wrote:
> From: Sebastien Dugue <[EMAIL PROTECTED]>
> Date: Tue, 22 Jul 2008 11:56:41 +0200
> Subject: [PATCH][RT] powerpc - Make the irq reverse mapping radix tree
> lockless
>
> The radix tree used by interrupt controllers for their irq reverse
> m
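
The reader side of that conversion is the usual RCU radix-tree pattern, sketched here against a placeholder tree rather than the irq_host internals:

static void *revmap_lookup(struct radix_tree_root *revmap_tree,
                           unsigned long hwirq)
{
        void *entry;

        rcu_read_lock();                /* no host lock needed any more */
        entry = radix_tree_lookup(revmap_tree, hwirq);
        rcu_read_unlock();

        return entry;
}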
existence guarantee on them.
Signed-off-by: Nick Piggin <[EMAIL PROTECTED]>
Signed-off-by: Dave Kleikamp <[EMAIL PROTECTED]>
---
arch/powerpc/Kconfig|3
arch/powerpc/mm/Makefile|2
arch/powerpc/mm/gup.c | 245
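
The leaf walk in such an implementation follows a common shape, reconstructed here from the description rather than the (truncated) file above: read each PTE once, refuse anything non-present or special, and take only speculative page references.

static int gup_leaf_range(pte_t *ptep, unsigned long addr, unsigned long end,
                          int write, struct page **pages, int *nr)
{
        do {
                pte_t pte = *ptep;      /* single read; the table may be going away */
                struct page *page;

                if (!pte_present(pte) || pte_special(pte))
                        return 0;
                if (write && !pte_write(pte))
                        return 0;

                page = pte_page(pte);
                if (!get_page_unless_zero(page))
                        return 0;       /* page being freed: take the slow path */

                pages[(*nr)++] = page;
        } while (ptep++, addr += PAGE_SIZE, addr != end);

        return 1;
}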
This can be folded into powerpc-implement-pte_special.patch
--
Ben has now freed up a pte bit on 64k pages. Use it for the special pte bit.
Signed-off-by: Nick Piggin <[EMAIL PROTECTED]>
---
Index: linux-2.6/include/asm-powerpc/pgtable
On Tue, May 13, 2008 at 12:25:27PM -0500, Jon Tollefson wrote:
> Instead of using the variable mmu_huge_psize to keep track of the huge
> page size we use an array of MMU_PAGE_* values. For each supported
> huge page size we need to know the hugepte_shift value and have a
> pgtable_cache. The hst
On Thursday 12 June 2008 22:14, Paul Mackerras wrote:
> Nick Piggin writes:
> > /* turn off LED */
> > val64 = readq(&bar0->adapter_control);
> > val64 = val64 &(~ADAPTER_LED_ON);
> >
On Thursday 12 June 2008 02:07, Jesse Barnes wrote:
> On Tuesday, June 10, 2008 8:29 pm Nick Piggin wrote:
> > You mention strong ordering WRT spin_unlock, which suggests that
> > you would prefer to take option #2 (the current powerpc one): io/io
> > is ordered and i
On Wednesday 11 June 2008 15:35, Nick Piggin wrote:
> On Wednesday 11 June 2008 15:13, Paul Mackerras wrote:
> > Nick Piggin writes:
> > > > I just wish we had even one actual example of things going wrong with
> > > > the current rules we have on powerpc to motiva
On Wednesday 11 June 2008 15:13, Paul Mackerras wrote:
> Nick Piggin writes:
> > > I just wish we had even one actual example of things going wrong with
> > > the current rules we have on powerpc to motivate changing to this
> > > model.
> >
> > ~/us
On Wednesday 11 June 2008 14:18, Paul Mackerras wrote:
> Nick Piggin writes:
> > OK, I'm sitll not quite sure where this has ended up. I guess you are
> > happy with x86 semantics as they are now. That is, all IO accesses are
> > strongly ordered WRT one another and W
On Wednesday 11 June 2008 13:40, Benjamin Herrenschmidt wrote:
> On Wed, 2008-06-11 at 13:29 +1000, Nick Piggin wrote:
> > Exactly, yes. I guess everybody has had good intentions here, but
> > as noticed, what is lacking is coordination and documentation.
> >
> > You
On Wednesday 11 June 2008 05:19, Jesse Barnes wrote:
> On Tuesday, June 10, 2008 12:05 pm Roland Dreier wrote:
> > > me too. That's the whole basis for readX_relaxed() and its cohorts:
> > > we make our weirdest machines (like altix) conform to the x86 norm.
> > > Then where it really kills us
On Wednesday 04 June 2008 05:07, Linus Torvalds wrote:
> On Tue, 3 Jun 2008, Trent Piepho wrote:
> > On Tue, 3 Jun 2008, Linus Torvalds wrote:
> > > On Tue, 3 Jun 2008, Nick Piggin wrote:
> > > > Linus: on x86, memory operations to wc and wc+ memory are not ordere
On Wednesday 04 June 2008 07:44, Trent Piepho wrote:
> On Tue, 3 Jun 2008, Matthew Wilcox wrote:
> > I don't understand why you keep talking about DMA. Are you talking
> > about ordering between readX() and DMA? PCI proides those guarantees.
>
> I guess you haven't been reading the whole thread.
On Wednesday 04 June 2008 00:47, Linus Torvalds wrote:
> On Tue, 3 Jun 2008, Nick Piggin wrote:
> > Linus: on x86, memory operations to wc and wc+ memory are not ordered
> > with one another, or operations to other memory types (ie. load/load
> > and store/store reordering
On Wednesday 04 June 2008 05:07, Linus Torvalds wrote:
> On Tue, 3 Jun 2008, Trent Piepho wrote:
> > On Tue, 3 Jun 2008, Linus Torvalds wrote:
> > > On Tue, 3 Jun 2008, Nick Piggin wrote:
> > > > Linus: on x86, memory operations to wc and wc+ memory are not ordere
ho wrote:
> >>>> On Tue, 3 Jun 2008, Linus Torvalds wrote:
> >>>>> On Tue, 3 Jun 2008, Nick Piggin wrote:
> >>>>>> Linus: on x86, memory operations to wc and wc+ memory are not
> >>>>>> ordered with one another, or operation
On Tuesday 03 June 2008 18:15, Jeremy Higdon wrote:
> On Tue, Jun 03, 2008 at 02:33:11PM +1000, Nick Piggin wrote:
> > On Monday 02 June 2008 19:56, Jes Sorensen wrote:
> > > Would we be able to use Ben's trick of setting a per cpu flag in
> > > writel() then
On Tuesday 03 June 2008 16:53, Paul Mackerras wrote:
> Nick Piggin writes:
> > So your readl can pass an earlier cacheable store or earlier writel?
>
> No. It's quite gross at the moment, it has a sync before the access
> (i.e. a full mb()) and a twi; isync sequence after t
On Tuesday 03 June 2008 14:32, Benjamin Herrenschmidt wrote:
> > This whole thread also ties in with my posts about mmiowb (which IMO
> > should go away).
> >
> > readl/writel: strongly ordered wrt one another and other stores
> >to cacheable RAM, byteswapping
> > __readl/__writel:
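
The pattern the thread keeps circling is the documented mmiowb() one; a sketch with a placeholder device structure and register offset:

struct mydev {                          /* placeholder device */
        spinlock_t      lock;
        void __iomem    *regs;
};

static void issue_command(struct mydev *dev, u32 cmd)
{
        spin_lock(&dev->lock);
        writel(cmd, dev->regs + 0x10);  /* 0x10: placeholder register offset */
        mmiowb();       /* keep the MMIO write ordered before the unlock */
        spin_unlock(&dev->lock);
}

Without the mmiowb(), writes from two CPUs serialised by the same spinlock can still reach the device out of order on some platforms, which is exactly what the ordering-model debate above is about.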
On Monday 02 June 2008 19:56, Jes Sorensen wrote:
> Jeremy Higdon wrote:
> > We don't actually have that problem on the Altix. All writes issued
> > by CPU X will be ordered with respect to each other. But writes by
> > CPU X and CPU Y will not be, unless an mmiowb() is done by the
> > original C
On Monday 02 June 2008 17:24, Russell King wrote:
> On Tue, May 27, 2008 at 02:55:56PM -0700, Linus Torvalds wrote:
> > On Wed, 28 May 2008, Benjamin Herrenschmidt wrote:
> > > A problem with __raw_ though is that they -also- don't do byteswap,
> >
> > Well, that's why there is __readl() and __raw_
On Fri, May 23, 2008 at 04:40:21PM +1000, Paul Mackerras wrote:
> Nick Piggin writes:
>
> > Anyway, even if there were zero, then the point is still that you
> > implement that API, so you should either strongly order your
> > __raw_ and _relaxed then you can weaken
On Tue, May 13, 2008 at 12:19:36PM -0500, Jon Tollefson wrote:
> Allow alloc_bm_huge_page() to be overridden by architectures that can't
> always use bootmem. This requires huge_boot_pages to be available for
> use by this function. The 16G pages on ppc64 have to be reserved prior
> to boot-time. T
On Fri, May 23, 2008 at 02:53:21PM +1000, Paul Mackerras wrote:
> Nick Piggin writes:
>
> > There don't seem to actually be read*_relaxed calls that also use rmb
> > in the same file (although there is no reason why they might not appear).
> > But I must be thinking
On Fri, May 23, 2008 at 12:14:41PM +1000, Paul Mackerras wrote:
> Nick Piggin writes:
>
> > More than one device driver does raw/relaxed io accessors and expects the
> > *mb functions to order them.
>
> Can you point us at an example?
Uh, I might be getting confused b
On Wed, May 21, 2008 at 10:12:03PM +0200, Segher Boessenkool wrote:
> >>From memory, I measured lwsync is 5 times faster than eieio on
> >a dual G5. This was on a simple microbenchmark that made use of
> >smp_wmb for store ordering, but it did not involve any IO access
> >(which presumably would di
On Wed, May 21, 2008 at 11:43:00AM -0400, Benjamin Herrenschmidt wrote:
>
> On Wed, 2008-05-21 at 17:34 +0200, Nick Piggin wrote:
> > On Wed, May 21, 2008 at 11:26:32AM -0400, Benjamin Herrenschmidt wrote:
> > >
> > > On Wed, 2008-05-21 at 16:12 +0200, Nick Piggi
On Wed, May 21, 2008 at 11:26:32AM -0400, Benjamin Herrenschmidt wrote:
>
> On Wed, 2008-05-21 at 16:12 +0200, Nick Piggin wrote:
> > lwsync is the recommended method of store/store ordering on caching enabled
> > memory. For those subarchs which have lwsync, use it rat
On Wed, May 21, 2008 at 11:27:03AM -0400, Benjamin Herrenschmidt wrote:
>
> On Wed, 2008-05-21 at 16:10 +0200, Nick Piggin wrote:
> > Hi,
> >
> > I'm sure I've sent these patches before, but I can't remember why they
> > weren't merged. They
lwsync is the recommended method of store/store ordering on caching enabled
memory. For those subarchs which have lwsync, use it rather than eieio for
smp_wmb.
Signed-off-by: Nick Piggin <[EMAIL PROTECTED]>
---
Index: linux-2.6/include/asm-powerpc/sy
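
The selection being proposed boils down to something like the following sketch; the SMP_WMB_INSN spelling is mine, not the patch's:

#ifdef __SUBARCH_HAS_LWSYNC     /* later replaced by __powerpc64__, per the thread above */
#define SMP_WMB_INSN    "lwsync"
#else
#define SMP_WMB_INSN    "eieio"
#endif

#define smp_wmb()       __asm__ __volatile__ (SMP_WMB_INSN : : : "memory")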
appears
to be the only barrier which fits the bill.
Signed-off-by: Nick Piggin <[EMAIL PROTECTED]>
---
Index: linux-2.6/include/asm-powerpc/system.h
===
--- linux-2.6.orig/include/asm-powerpc/system.h
+++ linux-2.6/include/asm-power
n let the subsequent non trivial
patches filter up in their own time. I don't know, just a heads up.
On Wed, May 14, 2008 at 04:12:54PM -0700, Andrew Morton wrote:
> From: Nick Piggin <[EMAIL PROTECTED]>
>
> spufs: convert nopfn to fault
>
> Signed-off-by: Nick Piggin