On 05/02/2016 08:52, Denis Kirjanov wrote:
On 2/4/16, Christophe Leroy <christophe.le...@c-s.fr> wrote:

On 04/02/2016 12:37, Denis Kirjanov wrote:
On 2/4/16, Christophe Leroy <christophe.le...@c-s.fr> wrote:
This simplification helps the compiler. We now have only one test
instead of two, so it reduces the number of branches.

Signed-off-by: Christophe Leroy <christophe.le...@c-s.fr>
---
v2: new
v3: no change
v4: no change
v5: no change

   arch/powerpc/mm/dma-noncoherent.c | 2 +-
   1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/mm/dma-noncoherent.c b/arch/powerpc/mm/dma-noncoherent.c
index 169aba4..2dc74e5 100644
--- a/arch/powerpc/mm/dma-noncoherent.c
+++ b/arch/powerpc/mm/dma-noncoherent.c
@@ -327,7 +327,7 @@ void __dma_sync(void *vaddr, size_t size, int direction)
                 * invalidate only when cache-line aligned otherwise there is
                 * the potential for discarding uncommitted data from the cache
                 */
-               if ((start & (L1_CACHE_BYTES - 1)) || (size & (L1_CACHE_BYTES - 1)))
+               if ((start | end) & (L1_CACHE_BYTES - 1))
                        flush_dcache_range(start, end);
                else
                        invalidate_dcache_range(start, end);
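
The two conditions in the hunk above test different variables (size in the old version, end in the new one), so it may help to check that they agree. The following is only a minimal standalone sketch, not kernel code: it assumes end is computed as start + size, as the surrounding function appears to do, and it uses a hypothetical 32-byte cache line in place of the real, CPU-dependent L1_CACHE_BYTES. The helper names are made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumption for illustration only: the real value depends on the CPU. */
#define L1_CACHE_BYTES 32UL

/* Old condition: two separate tests, one on start and one on size. */
bool needs_flush_old(unsigned long start, unsigned long size)
{
	return (start & (L1_CACHE_BYTES - 1)) || (size & (L1_CACHE_BYTES - 1));
}

/* New condition: a single test on the OR of start and end. */
bool needs_flush_new(unsigned long start, unsigned long end)
{
	return ((start | end) & (L1_CACHE_BYTES - 1)) != 0;
}

/*
 * Compare both conditions for every start in [0, max_start) and every
 * size in [0, max_size), with end = start + size (assumed).
 * Returns the number of (start, size) pairs on which they disagree.
 */
unsigned long count_mismatches(unsigned long max_start, unsigned long max_size)
{
	unsigned long bad = 0;

	for (unsigned long start = 0; start < max_start; start++) {
		for (unsigned long size = 0; size < max_size; size++) {
			unsigned long end = start + size;

			if (needs_flush_old(start, size) != needs_flush_new(start, end))
				bad++;
		}
	}
	return bad;
}

/* Hypothetical self-check: aborts if the two conditions ever differ. */
void self_check(void)
{
	assert(count_mismatches(256, 256) == 0);
}
```

The reason they agree: if start is misaligned, both conditions are true; if start is aligned, the low bits of end equal the low bits of size, so testing end is the same as testing size. This holds for any power-of-two line size, not just the 32 bytes assumed here.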
The previous version of address cache-line aligned check reads perfectly
fine.
What's the benefit of this micro optimization?
With this optimisation we avoid one unnecessary test and two associated
jumps. Taking into account that __dma_sync() is one of the top ten CPU
consumers, I believe it is worth it:


Yeah, looks better. Did you compile the kernel with default compiler flags?

Thanks!
Yes I did

Christophe
