Re: p1020 unstable with 3.2

2011-12-25 Thread Alexander Graf

On 24.12.2011, at 07:53, Benjamin Herrenschmidt wrote:

> On Fri, 2011-12-23 at 17:54 +0100, Alexander Graf wrote:
>> Hi guys,
>> 
>> While trying to test my latest patch queue for ppc kvm, I realized
>> that even though the device trees got updated, the p1020 box still is
>> unstable. The trace below is the one I've seen the most. It only
>> occurs during network I/O which happens a lot on that box, since I'm
>> running it using NFS root.
>> 
>> As for configuration, I use kumar's "merge" branch from today and the
>> p1020rdb.dts device tree provided in that tree.
>> 
>> The last known good configuration I'm aware of is 3.0.
>> 
>> Any ideas what's going wrong here?
> 
> Try SLAB instead of SLUB and let me know. It -could- be a bogon in SLUB
> that should be fixed upstream now but I think did hit 3.2

Yup, things seem a lot more stable with SLAB now :).


Alex

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev


Re: [PATCH 4/5] KVM: PPC: Book3s HV: Implement get_dirty_log using hardware changed bit

2011-12-25 Thread Paul Mackerras
On Fri, Dec 23, 2011 at 02:23:30PM +0100, Alexander Graf wrote:

> So if I read things correctly, this is the only case you're setting
> pages as dirty. What if you have the following:
> 
>   guest adds HTAB entry x
>   guest writes to page mapped by x
>   guest removes HTAB entry x
>   host fetches dirty log

In that case the dirtiness is preserved in the setting of the
KVMPPC_RMAP_CHANGED bit in the rmap entry.  kvm_test_clear_dirty()
returns 1 if that bit is set (and clears it).  Using the rmap entry
for this is convenient because (a) we also use it for saving the
referenced bit when a HTAB entry is removed, and we can transfer both
R and C over in one operation; (b) we need to be able to save away the
C bit in real mode, and we already need to get the real-mode address
of the rmap entry -- if we wanted to save it in a dirty bitmap we'd
have to do an extra translation to get the real-mode address of the
dirty bitmap word; (c) to avoid SMP races, if we were asynchronously
setting bits in the dirty bitmap we'd have to do the double-buffering
thing that x86 does, which seems more complicated than using the rmap
entry (which we already have a lock bit for).
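
For readers following along, here is a minimal userspace sketch of the idea described above: a per-page rmap word keeps a "changed" flag that is set when a mapping is torn down and is tested and cleared by the dirty-log pass. This is not the patch code. The bit position and the helper names rmap_save_changed() and rmap_test_clear_changed() are assumptions made for illustration, and C11 atomics stand in for the rmap lock bit mentioned above.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bit position; the real rmap entry layout is not shown in this thread. */
#define RMAP_CHANGED (UINT64_C(1) << 0)

/* When a mapping is removed, fold the hardware C bit into the rmap word. */
static void rmap_save_changed(_Atomic uint64_t *rmap, bool hw_changed)
{
    assert(rmap != NULL);
    if (hw_changed)
        atomic_fetch_or(rmap, RMAP_CHANGED);
}

/* Dirty-log pass: return whether the page was changed, and clear the flag. */
static bool rmap_test_clear_changed(_Atomic uint64_t *rmap)
{
    assert(rmap != NULL);
    uint64_t old = atomic_fetch_and(rmap, ~RMAP_CHANGED);
    return (old & RMAP_CHANGED) != 0;
}
```

Because the flag lives in the rmap word, a page written through an HTAB entry that has since been removed is still reported once by the next dirty-log pass.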

> PS: Always CC kvm@vger for stuff that other might want to review
> (basically all patches)

So why do we have a separate kvm-ppc list then? :)

Paul.


Re: [PATCH 4/5] KVM: PPC: Book3s HV: Implement get_dirty_log using hardware changed bit

2011-12-25 Thread Takuya Yoshikawa

(2011/12/26 8:35), Paul Mackerras wrote:

> On Fri, Dec 23, 2011 at 02:23:30PM +0100, Alexander Graf wrote:
>
>> So if I read things correctly, this is the only case you're setting
>> pages as dirty. What if you have the following:
>>
>>   guest adds HTAB entry x
>>   guest writes to page mapped by x
>>   guest removes HTAB entry x
>>   host fetches dirty log
>
> In that case the dirtiness is preserved in the setting of the
> KVMPPC_RMAP_CHANGED bit in the rmap entry.  kvm_test_clear_dirty()
> returns 1 if that bit is set (and clears it).  Using the rmap entry
> for this is convenient because (a) we also use it for saving the
> referenced bit when a HTAB entry is removed, and we can transfer both
> R and C over in one operation; (b) we need to be able to save away the
> C bit in real mode, and we already need to get the real-mode address
> of the rmap entry -- if we wanted to save it in a dirty bitmap we'd
> have to do an extra translation to get the real-mode address of the
> dirty bitmap word; (c) to avoid SMP races, if we were asynchronously
> setting bits in the dirty bitmap we'd have to do the double-buffering
> thing that x86 does, which seems more complicated than using the rmap
> entry (which we already have a lock bit for).


From my x86 dirty logging experience I have some concern about your code:
it looks slow even when there are no or few dirty pages in the slot.

+	for (i = 0; i < memslot->npages; ++i) {
+		if (kvm_test_clear_dirty(kvm, rmapp))
+			__set_bit_le(i, map);
+		++rmapp;
+	}

The check is done for every page, and this can be very expensive because
the number of pages is not small.

When we scan the dirty_bitmap, 64 pages are checked at once, so
the problem is not as significant.

Though I do not know well what kvm-ppc's dirty logging is aiming at, I guess
reporting cleanliness to user space without noticeable delay is important.

E.g. for VGA, most of the cases are clean.  For live migration, the
chance of seeing a completely clean slot is small, but almost all cases
are sparse.
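
To make the cost difference concrete, here is a small userspace sketch contrasting the two scan styles. It is only an illustration, not kvm code: the function names, the byte-per-page array standing in for per-page rmap checks, and the 64-bit word size are all assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGES_PER_WORD 64

/* Per-page scan: one test per page, even when the whole slot is clean. */
static size_t count_dirty_per_page(const uint8_t *page_dirty, size_t npages)
{
    assert(page_dirty != NULL);
    size_t dirty = 0;
    for (size_t i = 0; i < npages; ++i)
        if (page_dirty[i])
            ++dirty;
    return dirty;
}

/* Word-wise scan: a clean run of 64 pages costs a single comparison.
 * Assumes any bits past npages in the last word are zero. */
static size_t count_dirty_wordwise(const uint64_t *bitmap, size_t npages)
{
    assert(bitmap != NULL);
    size_t nwords = (npages + PAGES_PER_WORD - 1) / PAGES_PER_WORD;
    size_t dirty = 0;
    for (size_t w = 0; w < nwords; ++w) {
        uint64_t word = bitmap[w];
        if (word == 0)
            continue;               /* 64 clean pages skipped at once */
        for (; word != 0; word &= word - 1)
            ++dirty;                /* clear the lowest set bit each step */
    }
    return dirty;
}
```

For a clean or sparse slot, the word-wise loop does roughly npages/64 comparisons, while the per-page loop always does npages tests.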




>> PS: Always CC kvm@vger for stuff that other might want to review
>> (basically all patches)


(Though I sometimes check kvm-ppc on the archives,)

GET_DIRTY_LOG thing will be welcome.

Takuya



> So why do we have a separate kvm-ppc list then? :)



[PATCH] fsldma: fix performance degradation by optimizing spinlock use.

2011-12-25 Thread b29237
From: Forrest shi 

The DMA status check function fsl_tx_status() is called heavily in
a tight loop, and the desc_lock it takes is contended by the DMA
status update function. This degrades DMA performance significantly.

This patch removes the locking from fsl_tx_status() and introduces
an smp_mb() to avoid possible memory inconsistency.

Signed-off-by: Forrest Shi 
---
 drivers/dma/fsldma.c |6 +-
 1 files changed, 1 insertions(+), 5 deletions(-)

diff --git a/drivers/dma/fsldma.c b/drivers/dma/fsldma.c
index 8a78154..008fb5e 100644
--- a/drivers/dma/fsldma.c
+++ b/drivers/dma/fsldma.c
@@ -986,15 +986,11 @@ static enum dma_status fsl_tx_status(struct dma_chan *dchan,
 	struct fsldma_chan *chan = to_fsl_chan(dchan);
 	dma_cookie_t last_complete;
 	dma_cookie_t last_used;
-	unsigned long flags;
-
-	spin_lock_irqsave(&chan->desc_lock, flags);
 
 	last_complete = chan->completed_cookie;
+	smp_mb();
 	last_used = dchan->cookie;
 
-	spin_unlock_irqrestore(&chan->desc_lock, flags);
-
 	dma_set_tx_state(txstate, last_complete, last_used, 0);
 	return dma_async_is_complete(cookie, last_complete, last_used);
 }
-- 
1.7.0.4
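
As a rough userspace analogy of the read pattern in the patch above (not driver code): two cookies are read without a lock, with a full fence between the loads standing in for smp_mb(). The struct and function names here are hypothetical. The comparison in cookie_status() follows the wrap-aware logic of the kernel's dma_async_is_complete() as I understand it. Whether the barrier alone is sufficient for the driver is a question for review and is not settled by this sketch.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

typedef int32_t cookie_t;
enum status { STATUS_SUCCESS, STATUS_IN_PROGRESS };

/* Hypothetical stand-in for the channel state read by the status function. */
struct chan_state {
    _Atomic cookie_t completed_cookie;  /* last cookie the hardware finished */
    _Atomic cookie_t last_used_cookie;  /* last cookie handed out */
};

/* Wrap-aware completion check on a snapshot of the two cookies. */
static enum status cookie_status(cookie_t cookie, cookie_t last_complete,
                                 cookie_t last_used)
{
    if (last_complete <= last_used) {
        if (cookie <= last_complete || cookie > last_used)
            return STATUS_SUCCESS;
    } else {
        if (cookie <= last_complete && cookie > last_used)
            return STATUS_SUCCESS;
    }
    return STATUS_IN_PROGRESS;
}

/* Lockless status read: load completed first, fence, then load last used. */
static enum status chan_status(struct chan_state *chan, cookie_t cookie)
{
    assert(chan != NULL);
    cookie_t last_complete =
        atomic_load_explicit(&chan->completed_cookie, memory_order_relaxed);
    atomic_thread_fence(memory_order_seq_cst);  /* userspace analog of smp_mb() */
    cookie_t last_used =
        atomic_load_explicit(&chan->last_used_cookie, memory_order_relaxed);
    return cookie_status(cookie, last_complete, last_used);
}
```

The fence only orders the two loads in the reader; the writer side needs matching ordering for the snapshot to be meaningful.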

