Re: [PATCH] staging: android: ion: reduce lock contention latency

2017-03-14 Thread junil0814....@lge.com

Laura Abbott wrote on 2017-03-15 at 2:10 AM:
> On 03/14/2017 12:51 AM, Junil Lee wrote:
>> Replace the plain list in the ion page pool with a lock-less list.
>>
>> Measure mutex lock contention latency on Android.
>>
>> 1. The test is done on Android 7.0.
>> 2. Start up many applications repeatedly in a loop.
>> 3. Find samples in the trace log as below.
>>
>> cameraserver-625 [004] ...1 1891.952958: mutex_lock_enter: id=0
>> Binder:384_2-417 [005] ...1 1891.952958: mutex_lock_enter: id=0
>> Binder:384_2-417 [005] ...1 1891.952966: mutex_lock_enter: id=1
>> Binder:384_2-417 [005] ...1 1891.952970: mutex_lock_enter: id=0
>> Binder:384_2-417 [005] ...1 1891.952971: mutex_lock_enter: id=1
>> Binder:384_2-417 [005] ...1 1891.952982: mutex_lock_enter: id=0
>> Binder:384_2-417 [005] ...1 1891.952983: mutex_lock_enter: id=1
>> Binder:384_2-417 [005] ...1 1891.952989: mutex_lock_enter: id=0
>> Binder:384_2-417 [005] ...1 1891.952989: mutex_lock_enter: id=1
>> Binder:384_2-417 [005] ...1 1891.952995: mutex_lock_enter: id=0
>> cameraserver-625 [004] ...1 1891.952995: mutex_lock_enter: id=1
>>
>> - id=0 means trying to take the lock, id=1 means the lock was acquired
>>
>> The latency reduction from this patch is measured as below.
>>
>> The test starts up 60 applications in a loop (repeated for 10 cycles):
>> - lock contention count : 3717 -> 93
>>
> 
> We really need to finish up the work to move Ion out of staging before
> looking at performance improvements so please help with that discussion.
> Once that is finished up, we can look at performance improvements again.
> 
> That said, this is removing the lock from the free path. Is the
> contention happening on the free path? Is the system heap deferring
> frees? Does this have a notable effect on anything besides the count
> of contention? None of this is necessarily a deal breaker but I'd
> like a few more details.

Thank you for your comments, Laura.

First, yes: my target does use the system heap's deferred free.

The background of this patch is that I tried to shrink the ion pool
aggressively at specific times (e.g. shrink a 40MB ion pool instead of
triggering an LMK kill). In that test, ion allocation sometimes took a
very long time, and I found it was caused by lock contention.
So I tried to reduce the contention by switching to a lock-less list,
as in my patch.
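
To make the idea concrete, here is a rough sketch of the consumer side
(the real hunks for that path are cut off in the quoted diff below, so
the function name and the drain-everything strategy are only my
illustration, not the actual patch):

/*
 * Illustration only, not a patch hunk: drain the lock-less low-memory
 * list without holding pool->mutex.  llist_del_all() detaches the
 * whole chain atomically, so free-path callers can keep llist_add()ing
 * concurrently.  nr_to_scan handling is omitted for brevity.
 */
static int ion_page_pool_drain_low(struct ion_page_pool *pool)
{
	struct llist_node *node, *next;
	struct page *page;
	int freed = 0;

	node = llist_del_all(&pool->low_items);
	while (node) {
		next = node->next;
		/* the free path stored &page->lru cast to an llist_node */
		page = container_of((struct list_head *)node,
				    struct page, lru);
		ion_page_pool_free_pages(pool, page);
		atomic_dec(&pool->low_count);
		freed++;
		node = next;
	}

	return freed;
}

The point is that the only remaining serialization is inside llist
itself (a cmpxchg loop), so the free path never sleeps on pool->mutex.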

As in my first email, the cost of a single contention is small, about
27 usec (the time cameraserver waits between trying to take the lock
and actually acquiring it), so it is hard to demonstrate a meaningful
improvement from waiting-time numbers alone.
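
As a rough back-of-envelope estimate only: with about 3717 contentions
in the test above and on the order of 27 usec per contention, the total
waiting time works out to roughly 3717 * 27 usec, i.e. about 100 msec
over the whole run, scattered across many short stalls.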

Therefore, I added some code, shown below, to count how many
contentions actually happen.

static int ion_page_pool_add(struct ion_page_pool *pool, struct page *page)
{
+   if (mutex_is_locked(&pool->mutex))
+   atomic_inc(&mutex_add_lock);
+
mutex_lock(&pool->mutex);
if (PageHighMem(page)) {
list_add_tail(&page->lru, &pool->high_items);
@@ -101,6 +106,9 @@ struct page *ion_page_pool_alloc(struct ion_page_pool *pool)

BUG_ON(!pool);

+   if (mutex_is_locked(&pool->mutex))
+   atomic_inc(&mutex_remove_lock);
+
trace_mutex_lock_enter(0);
mutex_lock(&pool->mutex);
trace_mutex_lock_enter(1);
@@ -177,6 +185,9 @@ int ion_page_pool_shrink(struct ion_page_pool *pool, gfp_t gfp_mask,
for (freed = 0; freed < nr_to_scan; freed++) {
struct page *page;

+   if (mutex_is_locked(&pool->mutex))
+   atomic_inc(&mutex_remove_lock);
+
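
For reference, the debug counters above are just global atomics,
declared roughly like this (measurement-only, not part of the patch):

static atomic_t mutex_add_lock = ATOMIC_INIT(0);
static atomic_t mutex_remove_lock = ATOMIC_INIT(0);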

Then I collected the contention counts.
Before (unpatched): add_lock_cnt: 2519, remove_lock: 1202
After (patched): add_lock_cnt: N/A, remove_lock: 93
(The patched ion no longer takes the lock on the free path, so
add_lock_cnt does not apply there.)

Thanks,
Junil Lee

> 
> Thanks,
> Laura
> 
>> Signed-off-by: Bongkyu Kim 
>> Signed-off-by: Junil Lee 
>> ---
>> drivers/staging/android/ion/ion_page_pool.c | 52 ++-
>> drivers/staging/android/ion/ion_priv.h | 8 ++---
>> drivers/staging/android/ion/ion_system_heap.c | 16 -
>> 3 files changed, 40 insertions(+), 36 deletions(-)
>>
>> diff --git a/drivers/staging/android/ion/ion_page_pool.c b/drivers/staging/android/ion/ion_page_pool.c
>> index aea89c1..1beb2c8 100644
>> --- a/drivers/staging/android/ion/ion_page_pool.c
>> +++ b/drivers/staging/android/ion/ion_page_pool.c
>> @@ -22,6 +22,7 @@
>> #include 
>> #include 
>> #include 
>> +#include <linux/llist.h>
>> #include "ion_priv.h"
>>
>> static void *ion_page_pool_alloc_pages(struct ion_page_pool *pool)
>> @@ -44,33 +45,36 @@ static void ion_page_pool_free_pages(struct ion_page_pool *pool,
>>
>> static int ion_page_pool_add(struct ion_page_pool *pool, struct page *page)
>> {
>> - mutex_lock(&pool->mutex);
>> if (PageHighMem(page)) {
>> - list_add_tail(&page->lru, &pool->high_items);
>> - pool->high_count++;
>> + llist_add((struct llist_node *)&page->lru, &pool->high_items);
>> + atomic_inc(&pool->high_count);
>> } else {
>> - list_add_tail(&page->lru, &pool->low_items);
>> - pool->low_count++;
>> + llist_add((struct llist_node *)&page->lru, &pool->low_items);
>> + atomic
