Hi Ganesh, On Mon, Apr 02, 2018 at 06:01:59PM +0800, Ganesh Mahendran wrote: > 2018-04-02 15:11 GMT+08:00 Minchan Kim <minc...@kernel.org>: > > On Mon, Apr 02, 2018 at 02:46:14PM +0800, Ganesh Mahendran wrote: > >> 2018-04-02 14:34 GMT+08:00 Minchan Kim <minc...@kernel.org>: > >> > On Fri, Mar 30, 2018 at 12:04:07PM +0200, Greg Kroah-Hartman wrote: > >> >> On Fri, Mar 30, 2018 at 10:29:21AM +0900, Minchan Kim wrote: > >> >> > Hi Ganesh, > >> >> > > >> >> > On Fri, Mar 30, 2018 at 09:21:55AM +0800, Ganesh Mahendran wrote: > >> >> > > 2018-03-29 14:54 GMT+08:00 Minchan Kim <minc...@kernel.org>: > >> >> > > > binder_update_page_range needs down_write of mmap_sem because > >> >> > > > vm_insert_page need to change vma->vm_flags to VM_MIXEDMAP unless > >> >> > > > it is set. However, when I profile binder working, it seems > >> >> > > > every binder buffers should be mapped in advance by binder_mmap. > >> >> > > > It means we could set VM_MIXEDMAP in binder_mmap time which is > >> >> > > > already hold a mmap_sem as down_write so binder_update_page_range > >> >> > > > doesn't need to hold a mmap_sem as down_write. > >> >> > > > > >> >> > > > Android suffers from mmap_sem contention so let's reduce mmap_sem > >> >> > > > down_write. > >> >> > > > >> >> > > Hi, Minchan: > >> >> > > > >> >> > > It seems there is performance regression of this patch. > >> >> > > >> >> > You mean "This patch aims for solving performance regression" not > >> >> > "This patch > >> >> > makes performance regression"? > >> >> > > >> >> > > > >> >> > > Do you have some test result of android app launch time or > >> >> > > binderThroughput? > >> >> > > >> >> > Unfortunately, I don't have any number. The goal is to reduce the > >> >> > number of > >> >> > call mmap_sem as write-side lock because it makes priority inversion > >> >> > of threads > >> >> > easily and that's one of clear part I spot that we don't need > >> >> > write-side lock. > >> >> > >> >> Please always run the binderThroughput tests when making binder changes > >> >> (there is a binder test suite in the CTS Android tests), as that ensures > >> >> that you are not causing performance regressions as well as just normal > >> >> bug regressions :) > >> > > >> > Thanks for the information. I didn't notice that such kinds of tests for > >> > binder. I will keep it in mind. > >> > > >> > Today, I have setup the testing for my phone and found testing was very > >> > fluctuating even without my patch. It might be not good with my test > >> > skill. I emulated user's behavior with various touch event. With it, I > >> > open > >> > various apps and play with them several times. Before starting the test, > >> > I did "adb shell stop && adb shell start && echo 3 > > >> > /proc/sys/vm/drop_caches" > >> > > >> > Such 15% noise was very easy to make it. > >> > > >> > Ganesh, How did you measure? What's the stddev? > >> > >> Hi, Minchan: > >> > >> Sorry for the late response, a little busy these days. :) > >> > >> We have our own test tools to measure app launch time, or you can use > >> android systrace to get the app launch time. We tested your V1 patch: > >> https://patchwork.kernel.org/patch/10312057/ > >> and found app lunch time regression. > > > > V1 had a bug with VM_MAYWRITE. Could you confirm it with v5? > > I have finished binder Throughput test. The test result is stable, > there is no performance > regression found both in v1 and v5.
Thanks for the test! Now I'm struggling with setting up BinderThrough test. Binder matainers: If it's really one every binder contributors should do before the sending their patch, couldn't we have them in kernel directory like kselftest? Like me who understand just a part of code, it's hard to download android userspace full code and build/test. > > base patch_v1 patch_v5 > ----------------------------------------------------------- > 91223.4 90560.2 89644.5 > 90520.3 89583.1 89048.2 > 89833.2 90247.6 90091.3 > 90740.2 90276.7 90994.2 > 89703.5 90112.4 89994.6 > 89945.1 89122.8 88937.7 > 89872.8 90357.3 89307.4 > 89913.2 90355.4 89563.8 > 88979 90393.4 90182.8 > 89577.3 90946.8 90441.4 > AVG 90030.8 90195.57 89820.59 Yes, no regression. > > Before the test, I stop the android framework by: > adb shell stop > > > > > Please tell me more detail. What apps are slower compared to old? > > Every apps are slowed with avg 15%? Then, what's the stddev? > > Not all of the apps slowed 15%, The app *avg* launch time slowed 15%. > And We will re-launch the test tomorrow: base, v1,v5. We will get the > test result in two days later. Then I will post all the app launch time > details. I'm also trying to make stable result in my side but it's really hard to get. Please post stddev of each app as well as avg when you finished testing. I really appreicate you. > > > > > The reason I'm asking is as I mentioned, it would be caused by rw_semaphore > > implementation and priority of threads which calls binder operation so I > > guess it would be not deterministic. > > > > When I had an simple experiment, it was very fluctuating as I expected. > > (the testing enviroment might be not good in my side). > > If it's real problem on real practice, better fix is not using write_lock > > of mmap_sem(it's abusing the write-side lock) but should adjust priority, > > I think. What do you think? > > If you want to narrow the range of the problem. We can disable binder priority > inherit, and do not set the priority(currently it is nice -10 or fifo) > of top app in Android AMS. > I think we need to wait for the test result to see whether it really > has performance > regression. Look at up_write. (Let's assume process B is head of wait list of rw_semaphore, and then C, D, E) If the process B is trying to down_write and previous lock holder A is called up_write, the only B could be waked up so there is no contention to get CPU slice. It's the current as-is but if we changes B to try to down_read instead of down_write, B should be competed with other down_read C,D,E in so the chance would be rare to be scheduled. It's really (timing|priority of binder and other threads) problem so I don't understand what you said how we could narrow down the problem with disabling binder priority.