Kishore, do you have news?

On Fri, Jul 14, 2017 at 09:05 Enrico Olivelli <[email protected]> wrote:

> At the meeting we agreed that Kishore will perform some benchmarks on his
> side.
> He will take a look at my code, and we are going to share the results.
> Maybe it will be possible to share the results of the benchmarks done by
> Kishore at Salesforce too.
>
> The primary goal is to understand the differences between 4.4 and 4.5, for
> instance whether there is a need to change the JVM/BK configuration in
> order to make 4.5 perform as well as 4.4.
>
> @Sijie I hope I have answered your questions.
>
>
> -- Enrico
>
>
> 2017-07-13 9:29 GMT+02:00 Enrico Olivelli <[email protected]>:
>
>>
>>
>> 2017-07-13 4:11 GMT+02:00 Sijie Guo <[email protected]>:
>>
>>> On Wed, Jul 12, 2017 at 10:35 PM, Enrico Olivelli <[email protected]>
>>> wrote:
>>>
>>> > Sijie, JV, just a recap of my point of view:
>>> > - considering latency = "time for asyncAddEntry to complete"
>>> > - there is some difference between 4.4 and 4.5 in the usage of memory,
>>> > but it is not so clear
>>> > - the type of GC (parallel vs G1) does not have much impact, but with
>>> > G1 you achieve the best latency
>>> >
>>>
>>> Did you try CMS for v3 protocol?
>>>
>>
>> BK 4.5 + v3 protocol + CMS + throttle = 0
>>
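>> (How to read these lines: the bench adds 1000 entries of 0,034 MB each,
>> so "ms per entry (latency)" = total callbacks time / 1000, "ms per entry
>> (throughput)" = total wall clock time / 1000, and MB/s = ~34 MB / total
>> wall clock time.)
>>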
>> #0 Total wall clock time: 309.0 ms, total callbacks time: 249753 ms,
>> entry size 0,034 MB -> 249,75 ms per entry (latency),0,3 ms per entry
>> (throughput) 110,6 MB/s throughput
>> #1 Total wall clock time: 192.0 ms, total callbacks time: 156268 ms,
>> entry size 0,034 MB -> 156,27 ms per entry (latency),0,2 ms per entry
>> (throughput) 178,0 MB/s throughput
>> #2 Total wall clock time: 129.0 ms, total callbacks time: 95279 ms, entry
>> size 0,034 MB -> 95,28 ms per entry (latency),0,1 ms per entry (throughput)
>> 265,0 MB/s throughput
>> #3 Total wall clock time: 125.0 ms, total callbacks time: 94661 ms, entry
>> size 0,034 MB -> 94,66 ms per entry (latency),0,1 ms per entry (throughput)
>> 273,4 MB/s throughput
>> #4 Total wall clock time: 112.0 ms, total callbacks time: 83015 ms, entry
>> size 0,034 MB -> 83,02 ms per entry (latency),0,1 ms per entry (throughput)
>> 305,2 MB/s throughput
>> #5 Total wall clock time: 111.0 ms, total callbacks time: 82839 ms, entry
>> size 0,034 MB -> 82,84 ms per entry (latency),0,1 ms per entry (throughput)
>> 307,9 MB/s throughput
>> #6 Total wall clock time: 114.0 ms, total callbacks time: 82562 ms, entry
>> size 0,034 MB -> 82,56 ms per entry (latency),0,1 ms per entry (throughput)
>> 299,8 MB/s throughput
>> #7 Total wall clock time: 104.0 ms, total callbacks time: 74356 ms, entry
>> size 0,034 MB -> 74,36 ms per entry (latency),0,1 ms per entry (throughput)
>> 328,7 MB/s throughput
>> #8 Total wall clock time: 110.0 ms, total callbacks time: 83488 ms, entry
>> size 0,034 MB -> 83,49 ms per entry (latency),0,1 ms per entry (throughput)
>> 310,7 MB/s throughput
>> #9 Total wall clock time: 102.0 ms, total callbacks time: 68787 ms, entry
>> size 0,034 MB -> 68,79 ms per entry (latency),0,1 ms per entry (throughput)
>> 335,1 MB/s throughput
>>
>> gc.log
>> Java HotSpot(TM) 64-Bit Server VM (25.121-b13) for linux-amd64 JRE
>> (1.8.0_121-b13), built on Dec 12 2016 16:36:53 by "java_re" with gcc 4.3.0
>> 20080428 (Red Hat 4.3.0-8)
>> Memory: 4k page, physical 32870636k(3276148k free), swap
>> 15622140k(15460604k free)
>> CommandLine flags: -XX:+CMSIncrementalMode -XX:InitialHeapSize=2147483648
>> -XX:MaxHeapSize=2147483648
>> -XX:MaxNewSize=697933824 -XX:MaxTenuringThreshold=6 -XX:NewSize=697933824
>> -XX:OldPLABSize=16 -XX:OldSize=1395867648 -XX:+PrintGC
>> -XX:+PrintGCTimeStamps -XX:+UseCompressedClassPointers
>> -XX:+UseCompressedOops -XX:+UseConcMarkSweepGC -XX:+UseParNewGC
>> 0,630: [GC (Allocation Failure)  545344K->8508K(2029056K), 0,0065802 secs]
>> 1,196: [GC (Allocation Failure)  553852K->117940K(2029056K), 0,0385576
>> secs]
>> 1,235: [GC (CMS Initial Mark)  119486K(2029056K), 0,0022661 secs]
>> 1,544: [GC (Allocation Failure)  663284K->207833K(2029056K), 0,0330904
>> secs]
>> 1,731: [GC (CMS Final Remark)  560308K(2029056K), 0,0105624 secs]
>> 1,837: [GC (Allocation Failure)  753177K->214147K(2029056K), 0,0105574
>> secs]
>> 2,115: [GC (Allocation Failure)  759491K->241263K(2029056K), 0,0166134
>> secs]
>>
>>
>> BK 4.5 + v2 protocol + CMS + throttle = 0
>>
>> #0 Total wall clock time: 198.0 ms, total callbacks time: 169954 ms,
>> entry size 0,034 MB -> 169,95 ms per entry (latency),0,2 ms per entry
>> (throughput) 172,6 MB/s throughput
>> #1 Total wall clock time: 106.0 ms, total callbacks time: 70885 ms, entry
>> size 0,034 MB -> 70,89 ms per entry (latency),0,1 ms per entry (throughput)
>> 322,4 MB/s throughput
>> #2 Total wall clock time: 109.0 ms, total callbacks time: 71848 ms, entry
>> size 0,034 MB -> 71,85 ms per entry (latency),0,1 ms per entry (throughput)
>> 313,6 MB/s throughput
>> #3 Total wall clock time: 97.0 ms, total callbacks time: 73583 ms, entry
>> size 0,034 MB -> 73,58 ms per entry (latency),0,1 ms per entry (throughput)
>> 352,4 MB/s throughput
>> #4 Total wall clock time: 91.0 ms, total callbacks time: 61293 ms, entry
>> size 0,034 MB -> 61,29 ms per entry (latency),0,1 ms per entry (throughput)
>> 375,6 MB/s throughput
>> #5 Total wall clock time: 98.0 ms, total callbacks time: 73006 ms, entry
>> size 0,034 MB -> 73,01 ms per entry (latency),0,1 ms per entry (throughput)
>> 348,8 MB/s throughput
>> #6 Total wall clock time: 104.0 ms, total callbacks time: 75696 ms, entry
>> size 0,034 MB -> 75,70 ms per entry (latency),0,1 ms per entry (throughput)
>> 328,7 MB/s throughput
>> #7 Total wall clock time: 103.0 ms, total callbacks time: 79260 ms, entry
>> size 0,034 MB -> 79,26 ms per entry (latency),0,1 ms per entry (throughput)
>> 331,8 MB/s throughput
>> #8 Total wall clock time: 92.0 ms, total callbacks time: 62232 ms, entry
>> size 0,034 MB -> 62,23 ms per entry (latency),0,1 ms per entry (throughput)
>> 371,5 MB/s throughput
>> #9 Total wall clock time: 98.0 ms, total callbacks time: 65359 ms, entry
>> size 0,034 MB -> 65,36 ms per entry (latency),0,1 ms per entry (throughput)
>> 348,8 MB/s throughput
>>
>>
>> gc.log
>> Java HotSpot(TM) 64-Bit Server VM (25.121-b13) for linux-amd64 JRE
>> (1.8.0_121-b13), built on Dec 12 2016 16:36:53 by "java_re" with gcc 4.3.0
>> 20080428 (Red Hat 4.3.0-8)
>> Memory: 4k page, physical 32870636k(3154408k free), swap
>> 15622140k(15460604k free)
>> CommandLine flags: -XX:+CMSIncrementalMode -XX:InitialHeapSize=2147483648
>> -XX:MaxHeapSize=2147483648
>> -XX:MaxNewSize=697933824 -XX:MaxTenuringThreshold=6 -XX:NewSize=697933824
>> -XX:OldPLABSize=16 -XX:OldSize=1395867648 -XX:+PrintGC
>> -XX:+PrintGCTimeStamps -XX:+UseCompressedClassPointers
>> -XX:+UseCompressedOops -XX:+UseConcMarkSweepGC -XX:+UseParNewGC
>> 0,554: [GC (Allocation Failure)  545344K->8624K(2029056K), 0,0060763 secs]
>> 1,478: [GC (Allocation Failure)  552028K->79872K(2029056K), 0,0252399
>> secs]
>> 1,504: [GC (CMS Initial Mark)  84227K(2029056K), 0,0017813 secs]
>>
>>
>> BK 4.4 + v3 protocol + CMS + throttle = 0
>>
>> #0 Total wall clock time: 214.0 ms, total callbacks time: 207369 ms,
>> entry size 0,034 MB -> 207,37 ms per entry (latency),0,2 ms per entry
>> (throughput) 159,7 MB/s throughput
>> #1 Total wall clock time: 153.0 ms, total callbacks time: 140661 ms,
>> entry size 0,034 MB -> 140,66 ms per entry (latency),0,2 ms per entry
>> (throughput) 223,4 MB/s throughput
>> #2 Total wall clock time: 159.0 ms, total callbacks time: 145131 ms,
>> entry size 0,034 MB -> 145,13 ms per entry (latency),0,2 ms per entry
>> (throughput) 215,0 MB/s throughput
>> #3 Total wall clock time: 109.0 ms, total callbacks time: 105018 ms,
>> entry size 0,034 MB -> 105,02 ms per entry (latency),0,1 ms per entry
>> (throughput) 313,6 MB/s throughput
>> #4 Total wall clock time: 114.0 ms, total callbacks time: 110111 ms,
>> entry size 0,034 MB -> 110,11 ms per entry (latency),0,1 ms per entry
>> (throughput) 299,8 MB/s throughput
>> #5 Total wall clock time: 118.0 ms, total callbacks time: 111534 ms,
>> entry size 0,034 MB -> 111,53 ms per entry (latency),0,1 ms per entry
>> (throughput) 289,7 MB/s throughput
>> #6 Total wall clock time: 98.0 ms, total callbacks time: 87425 ms, entry
>> size 0,034 MB -> 87,43 ms per entry (latency),0,1 ms per entry (throughput)
>> 348,8 MB/s throughput
>> #7 Total wall clock time: 135.0 ms, total callbacks time: 129583 ms,
>> entry size 0,034 MB -> 129,58 ms per entry (latency),0,1 ms per entry
>> (throughput) 253,2 MB/s throughput
>> #8 Total wall clock time: 99.0 ms, total callbacks time: 84986 ms, entry
>> size 0,034 MB -> 84,99 ms per entry (latency),0,1 ms per entry (throughput)
>> 345,2 MB/s throughput
>> #9 Total wall clock time: 125.0 ms, total callbacks time: 117492 ms,
>> entry size 0,034 MB -> 117,49 ms per entry (latency),0,1 ms per entry
>> (throughput) 273,4 MB/s throughput
>>
>> gc.log
>> Java HotSpot(TM) 64-Bit Server VM (25.121-b13) for linux-amd64 JRE
>> (1.8.0_121-b13), built on Dec 12 2016 16:36:53 by "java_re" with gcc 4.3.0
>> 20080428 (Red Hat 4.3.0-8)
>> Memory: 4k page, physical 32870636k(3212484k free), swap
>> 15622140k(15460604k free)
>> CommandLine flags: -XX:+CMSIncrementalMode -XX:InitialHeapSize=2147483648
>> -XX:MaxHeapSize=2147483648
>> -XX:MaxNewSize=697933824 -XX:MaxTenuringThreshold=6 -XX:NewSize=697933824
>> -XX:OldPLABSize=16 -XX:OldSize=1395867648 -XX:+PrintGC
>> -XX:+PrintGCTimeStamps -XX:+UseCompressedClassPointers
>> -XX:+UseCompressedOops -XX:+UseConcMarkSweepGC -XX:+UseParNewGC
>> 0,572: [GC (Allocation Failure)  545344K->8016K(2029056K), 0,0057737 secs]
>> 0,882: [GC (Allocation Failure)  553360K->88089K(2029056K), 0,0205542
>> secs]
>> 0,903: [GC (CMS Initial Mark)  93974K(2029056K), 0,0010542 secs]
>> 1,100: [GC (Allocation Failure)  633433K->94935K(2029056K), 0,0359948
>> secs]
>> 1,221: [GC (CMS Final Remark)  384627K(2029056K), 0,0079453 secs]
>> 1,313: [GC (Allocation Failure)  640279K->98309K(2029056K), 0,0157774
>> secs]
>> 1,481: [GC (Allocation Failure)  643653K->154062K(2029056K), 0,0179418
>> secs]
>> 1,677: [GC (Allocation Failure)  699406K->210685K(2029056K), 0,0312668
>> secs]
>> 1,709: [GC (CMS Initial Mark)  211186K(2029056K), 0,0006615 secs]
>> 1,894: [GC (Allocation Failure)  756029K->263420K(2029056K), 0,0168112
>> secs]
>>
>> BK 4.5 + throttle = 5000 + CMS + v3 protocol
>>
>> #0 Total wall clock time: 248.0 ms, total callbacks time: 102307 ms,
>> entry size 0,034 MB -> 102,31 ms per entry (latency),0,2 ms per entry
>> (throughput) 137,8 MB/s throughput
>> #1 Total wall clock time: 205.0 ms, total callbacks time: 12282 ms, entry
>> size 0,034 MB -> 12,28 ms per entry (latency),0,2 ms per entry (throughput)
>> 166,7 MB/s throughput
>> #2 Total wall clock time: 204.0 ms, total callbacks time: 6920 ms, entry
>> size 0,034 MB -> 6,92 ms per entry (latency),0,2 ms per entry (throughput)
>> 167,5 MB/s throughput
>> #3 Total wall clock time: 221.0 ms, total callbacks time: 13589 ms, entry
>> size 0,034 MB -> 13,59 ms per entry (latency),0,2 ms per entry (throughput)
>> 154,7 MB/s throughput
>> #4 Total wall clock time: 204.0 ms, total callbacks time: 6023 ms, entry
>> size 0,034 MB -> 6,02 ms per entry (latency),0,2 ms per entry (throughput)
>> 167,5 MB/s throughput
>> #5 Total wall clock time: 205.0 ms, total callbacks time: 6350 ms, entry
>> size 0,034 MB -> 6,35 ms per entry (latency),0,2 ms per entry (throughput)
>> 166,7 MB/s throughput
>> #6 Total wall clock time: 205.0 ms, total callbacks time: 7849 ms, entry
>> size 0,034 MB -> 7,85 ms per entry (latency),0,2 ms per entry (throughput)
>> 166,7 MB/s throughput
>> #7 Total wall clock time: 204.0 ms, total callbacks time: 6443 ms, entry
>> size 0,034 MB -> 6,44 ms per entry (latency),0,2 ms per entry (throughput)
>> 167,5 MB/s throughput
>> #8 Total wall clock time: 206.0 ms, total callbacks time: 7009 ms, entry
>> size 0,034 MB -> 7,01 ms per entry (latency),0,2 ms per entry (throughput)
>> 165,9 MB/s throughput
>> #9 Total wall clock time: 205.0 ms, total callbacks time: 6150 ms, entry
>> size 0,034 MB -> 6,15 ms per entry (latency),0,2 ms per entry (throughput)
>> 166,7 MB/s throughput
>>
>> Java HotSpot(TM) 64-Bit Server VM (25.121-b13) for linux-amd64 JRE
>> (1.8.0_121-b13), built on Dec 12 2016 16:36:53 by "java_re" with gcc 4.3.0
>> 20080428 (Red Hat 4.3.0-8)
>> Memory: 4k page, physical 32870636k(3247600k free), swap
>> 15622140k(15460604k free)
>> CommandLine flags: -XX:+CMSIncrementalMode -XX:InitialHeapSize=2147483648
>> -XX:MaxHeapSize=2147483648
>> -XX:MaxNewSize=697933824 -XX:MaxTenuringThreshold=6 -XX:NewSize=697933824
>> -XX:OldPLABSize=16 -XX:OldSize=1395867648 -XX:+PrintGC
>> -XX:+PrintGCTimeStamps -XX:+UseCompressedClassPointers
>> -XX:+UseCompressedOops -XX:+UseConcMarkSweepGC -XX:+UseParNewGC
>> 0,568: [GC (Allocation Failure)  545344K->8497K(2029056K), 0,0060302 secs]
>> 1,059: [GC (Allocation Failure)  553841K->112138K(2029056K), 0,0206230
>> secs]
>> 1,080: [GC (CMS Initial Mark)  117602K(2029056K), 0,0013282 secs]
>> 1,561: [GC (Allocation Failure)  654247K->189231K(2029056K), 0,0379235
>> secs]
>> 2,079: [GC (Allocation Failure)  731215K->185468K(2029056K), 0,0107924
>> secs]
>> 2,320: [GC (CMS Final Remark)  431925K(2029056K), 0,0076070 secs]
>> 2,635: [GC (Allocation Failure)  730812K->217376K(2029056K), 0,0136162
>> secs]
>>
>>
>> BK 4.4 + throttle = 5000 + CMS + v3 protocol
>>
>> #0 Total wall clock time: 208.0 ms, total callbacks time: 55274 ms, entry
>> size 0,034 MB -> 55,27 ms per entry (latency),0,2 ms per entry (throughput)
>> 164,3 MB/s throughput
>> #1 Total wall clock time: 207.0 ms, total callbacks time: 11848 ms, entry
>> size 0,034 MB -> 11,85 ms per entry (latency),0,2 ms per entry (throughput)
>> 165,1 MB/s throughput
>> #2 Total wall clock time: 208.0 ms, total callbacks time: 9442 ms, entry
>> size 0,034 MB -> 9,44 ms per entry (latency),0,2 ms per entry (throughput)
>> 164,3 MB/s throughput
>> #3 Total wall clock time: 204.0 ms, total callbacks time: 6739 ms, entry
>> size 0,034 MB -> 6,74 ms per entry (latency),0,2 ms per entry (throughput)
>> 167,5 MB/s throughput
>> #4 Total wall clock time: 205.0 ms, total callbacks time: 7546 ms, entry
>> size 0,034 MB -> 7,55 ms per entry (latency),0,2 ms per entry (throughput)
>> 166,7 MB/s throughput
>> #5 Total wall clock time: 204.0 ms, total callbacks time: 6594 ms, entry
>> size 0,034 MB -> 6,59 ms per entry (latency),0,2 ms per entry (throughput)
>> 167,5 MB/s throughput
>> #6 Total wall clock time: 205.0 ms, total callbacks time: 5885 ms, entry
>> size 0,034 MB -> 5,89 ms per entry (latency),0,2 ms per entry (throughput)
>> 166,7 MB/s throughput
>> #7 Total wall clock time: 204.0 ms, total callbacks time: 7722 ms, entry
>> size 0,034 MB -> 7,72 ms per entry (latency),0,2 ms per entry (throughput)
>> 167,5 MB/s throughput
>> #8 Total wall clock time: 224.0 ms, total callbacks time: 5968 ms, entry
>> size 0,034 MB -> 5,97 ms per entry (latency),0,2 ms per entry (throughput)
>> 152,6 MB/s throughput
>> #9 Total wall clock time: 209.0 ms, total callbacks time: 6731 ms, entry
>> size 0,034 MB -> 6,73 ms per entry (latency),0,2 ms per entry (throughput)
>> 163,5 MB/s throughput
>>
>> gc.log
>> Java HotSpot(TM) 64-Bit Server VM (25.121-b13) for linux-amd64 JRE
>> (1.8.0_121-b13), built on Dec 12 2016 16:36:53 by "java_re" with gcc 4.3.0
>> 20080428 (Red Hat 4.3.0-8)
>> Memory: 4k page, physical 32870636k(3291888k free), swap
>> 15622140k(15460604k free)
>> CommandLine flags: -XX:+CMSIncrementalMode -XX:InitialHeapSize=2147483648
>> -XX:MaxHeapSize=2147483648
>> -XX:MaxNewSize=697933824 -XX:MaxTenuringThreshold=6 -XX:NewSize=697933824
>> -XX:OldPLABSize=16 -XX:OldSize=1395867648 -XX:+PrintGC
>> -XX:+PrintGCTimeStamps -XX:+UseCompressedClassPointers
>> -XX:+UseCompressedOops -XX:+UseConcMarkSweepGC -XX:+UseParNewGC
>> 0,554: [GC (Allocation Failure)  545344K->8022K(2029056K), 0,0055210 secs]
>> 0,890: [GC (Allocation Failure)  553366K->93114K(2029056K), 0,0177003
>> secs]
>> 0,909: [GC (CMS Initial Mark)  97210K(2029056K), 0,0011810 secs]
>> 1,204: [GC (Allocation Failure)  638458K->102278K(2029056K), 0,0181779
>> secs]
>> 1,536: [GC (Allocation Failure)  645878K->103955K(2029056K), 0,0088187
>> secs]
>> 1,679: [GC (CMS Final Remark)  353595K(2029056K), 0,0078553 secs]
>> 1,858: [GC (Allocation Failure)  649299K->78654K(2029056K), 0,0064803
>> secs]
>> 2,187: [GC (Allocation Failure)  623998K->120130K(2029056K), 0,0125856
>> secs]
>> 2,200: [GC (CMS Initial Mark)  121879K(2029056K), 0,0003934 secs]
>> 2,518: [GC (Allocation Failure)  665474K->193813K(2029056K), 0,0202701
>> secs]
>>
>>
>>
>>
>>>
>>> > - using the v2 protocol brings huge improvements in the usage of memory
>>> > - if you want good "latency" for each entry you have to set throttle to
>>> > at least 1000, but the throughput will be much lower (30 MB/s vs
>>> > 300 MB/s on my disk)
>>> >
>>>
>>> What type of disk do you have? What is your filesystem mount options?
>>>
>>
>> Disk type: SSD
>> this is the fstab line:
>> /dev/mapper/Ssd-Home    /home    ext4    defaults,discard    1 2
>>
>> uname -a
>> Linux DNA101PC193.diennea.lan 4.11.5-100.fc24.x86_64 #1 SMP Wed Jun 14
>> 17:21:39 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
>>
>>
>>
>>>
>>> When you did the test, what did GC activity look like when setting
>>> throttle to more than 1000?
>>>
>>
>> I have added new results, see above
>>
>>
>>>
>>> I think the latency looked "bad" when throttle was more than 1000
>>> because of GC. Do you have those test results?
>>>
>>
>>
>> I apologize: in the results below, when I did not write "throttle = 0" I
>> meant throttle = 1000.
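>>
>> To be concrete about what "throttle" is: it is the client-side setting; a
>> minimal sketch of how I set it (not the exact bench code, assuming the
>> standard ClientConfiguration knob):
>>
>>     ClientConfiguration conf = new ClientConfiguration();
>>     // max outstanding asyncAddEntry requests; 0 disables the limit
>>     conf.setThrottleValue(1000);
>>     BookKeeper bk = new BookKeeper(conf);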
>>
>> Did you take a look at the code of the bench? Maybe a double check would
>> be useful; many times badly written benchmarks are not useful at all.
>>
>> Thank you
>> -- Enrico
>>
>>
>>
>>>
>>>
>>> > below are the results of my simple bench
>>> >
>>> > Enrico
>>> >
>>> > Bench code:
>>> >
>>> > https://github.com/eolivelli/bookkeepers-benchs/blob/master/src/test/java/BookKeeperWriteTest.java
>>> >
>>> > Results:
>>> > with 4.5.0-SNAPSHOT and v2 protocol - G1 GC
>>> >
>>> > #0 Total wall clock time: 1000.0 ms, total callbacks time: 8460 ms,
>>> entry
>>> > size 0,034 MB -> 8,46 ms per entry (latency),1,0 ms per entry
>>> (throughput)
>>> > 34,2 MB/s throughput
>>> > #1 Total wall clock time: 1005.0 ms, total callbacks time: 4529 ms,
>>> entry
>>> > size 0,034 MB -> 4,53 ms per entry (latency),1,0 ms per entry
>>> (throughput)
>>> > 34,0 MB/s throughput
>>> > #2 Total wall clock time: 1002.0 ms, total callbacks time: 4293 ms,
>>> entry
>>> > size 0,034 MB -> 4,29 ms per entry (latency),1,0 ms per entry
>>> (throughput)
>>> > 34,1 MB/s throughput
>>> > #3 Total wall clock time: 1003.0 ms, total callbacks time: 4317 ms,
>>> entry
>>> > size 0,034 MB -> 4,32 ms per entry (latency),1,0 ms per entry
>>> (throughput)
>>> > 34,1 MB/s throughput
>>> > #4 Total wall clock time: 1002.0 ms, total callbacks time: 4553 ms,
>>> entry
>>> > size 0,034 MB -> 4,55 ms per entry (latency),1,0 ms per entry
>>> (throughput)
>>> > 34,1 MB/s throughput
>>> > #5 Total wall clock time: 1001.0 ms, total callbacks time: 4448 ms,
>>> entry
>>> > size 0,034 MB -> 4,45 ms per entry (latency),1,0 ms per entry
>>> (throughput)
>>> > 34,1 MB/s throughput
>>> > #6 Total wall clock time: 1004.0 ms, total callbacks time: 4449 ms,
>>> entry
>>> > size 0,034 MB -> 4,45 ms per entry (latency),1,0 ms per entry
>>> (throughput)
>>> > 34,0 MB/s throughput
>>> > #7 Total wall clock time: 1003.0 ms, total callbacks time: 4504 ms,
>>> entry
>>> > size 0,034 MB -> 4,50 ms per entry (latency),1,0 ms per entry
>>> (throughput)
>>> > 34,1 MB/s throughput
>>> > #8 Total wall clock time: 1002.0 ms, total callbacks time: 4290 ms,
>>> entry
>>> > size 0,034 MB -> 4,29 ms per entry (latency),1,0 ms per entry
>>> (throughput)
>>> > 34,1 MB/s throughput
>>> > #9 Total wall clock time: 1003.0 ms, total callbacks time: 4441 ms,
>>> entry
>>> > size 0,034 MB -> 4,44 ms per entry (latency),1,0 ms per entry
>>> (throughput)
>>> > 34,1 MB/s throughput
>>> >
>>> > gc.log
>>> > Java HotSpot(TM) 64-Bit Server VM (25.121-b13) for linux-amd64 JRE
>>> > (1.8.0_121-b13), built on Dec 12 2016 16:36:53 by "java_re" with gcc
>>> 4.3.0
>>> > 20080428 (Red Hat 4.3.0-8)
>>> > Memory: 4k page, physical 32870636k(9153868k free), swap
>>> > 15622140k(15464700k free)
>>> > CommandLine flags: -XX:InitialHeapSize=2147483648
>>> > -XX:MaxHeapSize=2147483648 -XX:+PrintGC
>>> -XX:+PrintGCTimeStamps
>>> > -XX:+UseCompressedClassPointers -XX:+UseCompressedOops -XX:+UseG1GC
>>> > 1,988: [GC pause (G1 Evacuation Pause) (young) 146M->59M(2048M),
>>> 0,0104300
>>> > secs]
>>> >
>>> >
>>> > Results, with 4.5.0-SNAPSHOT and v3 protocol - G1 GC
>>> > #0 Total wall clock time: 1001.0 ms, total callbacks time: 11800 ms,
>>> entry
>>> > size 0,034 MB -> 11,80 ms per entry (latency),1,0 ms per entry
>>> (throughput)
>>> > 34,1 MB/s throughput
>>> > #1 Total wall clock time: 1001.0 ms, total callbacks time: 4585 ms,
>>> entry
>>> > size 0,034 MB -> 4,59 ms per entry (latency),1,0 ms per entry
>>> (throughput)
>>> > 34,1 MB/s throughput
>>> > #2 Total wall clock time: 1002.0 ms, total callbacks time: 4458 ms,
>>> entry
>>> > size 0,034 MB -> 4,46 ms per entry (latency),1,0 ms per entry
>>> (throughput)
>>> > 34,1 MB/s throughput
>>> > #3 Total wall clock time: 1002.0 ms, total callbacks time: 4471 ms,
>>> entry
>>> > size 0,034 MB -> 4,47 ms per entry (latency),1,0 ms per entry
>>> (throughput)
>>> > 34,1 MB/s throughput
>>> > #4 Total wall clock time: 1004.0 ms, total callbacks time: 4473 ms,
>>> entry
>>> > size 0,034 MB -> 4,47 ms per entry (latency),1,0 ms per entry
>>> (throughput)
>>> > 34,0 MB/s throughput
>>> > #5 Total wall clock time: 1003.0 ms, total callbacks time: 4477 ms,
>>> entry
>>> > size 0,034 MB -> 4,48 ms per entry (latency),1,0 ms per entry
>>> (throughput)
>>> > 34,1 MB/s throughput
>>> > #6 Total wall clock time: 1007.0 ms, total callbacks time: 4424 ms,
>>> entry
>>> > size 0,034 MB -> 4,42 ms per entry (latency),1,0 ms per entry
>>> (throughput)
>>> > 33,9 MB/s throughput
>>> > #7 Total wall clock time: 1005.0 ms, total callbacks time: 4576 ms,
>>> entry
>>> > size 0,034 MB -> 4,58 ms per entry (latency),1,0 ms per entry
>>> (throughput)
>>> > 34,0 MB/s throughput
>>> > #8 Total wall clock time: 1002.0 ms, total callbacks time: 4447 ms,
>>> entry
>>> > size 0,034 MB -> 4,45 ms per entry (latency),1,0 ms per entry
>>> (throughput)
>>> > 34,1 MB/s throughput
>>> > #9 Total wall clock time: 1007.0 ms, total callbacks time: 4501 ms,
>>> entry
>>> > size 0,034 MB -> 4,50 ms per entry (latency),1,0 ms per entry
>>> (throughput)
>>> > 33,9 MB/s throughput
>>> >
>>> > gc.log
>>> > Java HotSpot(TM) 64-Bit Server VM (25.121-b13) for linux-amd64 JRE
>>> > (1.8.0_121-b13), built on Dec 12 2016 16:36:53 by "java_re" with gcc
>>> 4.3.0
>>> > 20080428 (Red Hat 4.3.0-8)
>>> > Memory: 4k page, physical 32870636k(9042512k free), swap
>>> > 15622140k(15464700k free)
>>> > CommandLine flags: -XX:InitialHeapSize=2147483648
>>> > -XX:MaxHeapSize=2147483648 -XX:+PrintGC
>>> -XX:+PrintGCTimeStamps
>>> > -XX:+UseCompressedClassPointers -XX:+UseCompressedOops -XX:+UseG1GC
>>> > 0,830: [GC pause (G1 Evacuation Pause) (young) 118M->33M(2048M),
>>> 0,0116376
>>> > secs]
>>> > 1,372: [GC pause (G1 Evacuation Pause) (young) 162M->73M(2048M),
>>> 0,0055103
>>> > secs]
>>> > 2,172: [GC pause (G1 Evacuation Pause) (young) 227M->99M(2048M),
>>> 0,0023962
>>> > secs]
>>> > 10,358: [GC pause (G1 Evacuation Pause) (young) 1620M->122M(2048M),
>>> > 0,0026519 secs]
>>> > 11,676: [GC pause (Metadata GC Threshold) (young) (initial-mark)
>>> > 242M->114M(2048M), 0,0021797 secs]
>>> > 11,678: [GC concurrent-root-region-scan-start]
>>> > 11,679: [GC concurrent-root-region-scan-end, 0,0007421 secs]
>>> > 11,679: [GC concurrent-mark-start]
>>> > 11,684: [GC concurrent-mark-end, 0,0046645 secs]
>>> > 11,684: [GC remark, 0,0027234 secs]
>>> > 11,687: [GC cleanup 115M->111M(2048M), 0,0008957 secs]
>>> > 11,688: [GC concurrent-cleanup-start]
>>> > 11,688: [GC concurrent-cleanup-end, 0,0000070 secs]
>>> >
>>> >
>>> > With 4.4.0 - v3 protocol - G1 GC
>>> > #0 Total wall clock time: 1001.0 ms, total callbacks time: 6723 ms,
>>> entry
>>> > size 0,034 MB -> 6,72 ms per entry (latency),1,0 ms per entry
>>> (throughput)
>>> > 34,1 MB/s throughput
>>> > #1 Total wall clock time: 1003.0 ms, total callbacks time: 4495 ms,
>>> entry
>>> > size 0,034 MB -> 4,50 ms per entry (latency),1,0 ms per entry
>>> (throughput)
>>> > 34,1 MB/s throughput
>>> > #2 Total wall clock time: 1001.0 ms, total callbacks time: 4406 ms,
>>> entry
>>> > size 0,034 MB -> 4,41 ms per entry (latency),1,0 ms per entry
>>> (throughput)
>>> > 34,1 MB/s throughput
>>> > #3 Total wall clock time: 1005.0 ms, total callbacks time: 4457 ms,
>>> entry
>>> > size 0,034 MB -> 4,46 ms per entry (latency),1,0 ms per entry
>>> (throughput)
>>> > 34,0 MB/s throughput
>>> > #4 Total wall clock time: 1002.0 ms, total callbacks time: 4388 ms,
>>> entry
>>> > size 0,034 MB -> 4,39 ms per entry (latency),1,0 ms per entry
>>> (throughput)
>>> > 34,1 MB/s throughput
>>> > #5 Total wall clock time: 1004.0 ms, total callbacks time: 4451 ms,
>>> entry
>>> > size 0,034 MB -> 4,45 ms per entry (latency),1,0 ms per entry
>>> (throughput)
>>> > 34,0 MB/s throughput
>>> > #6 Total wall clock time: 1003.0 ms, total callbacks time: 4574 ms,
>>> entry
>>> > size 0,034 MB -> 4,57 ms per entry (latency),1,0 ms per entry
>>> (throughput)
>>> > 34,1 MB/s throughput
>>> > #7 Total wall clock time: 1003.0 ms, total callbacks time: 4523 ms,
>>> entry
>>> > size 0,034 MB -> 4,52 ms per entry (latency),1,0 ms per entry
>>> (throughput)
>>> > 34,1 MB/s throughput
>>> > #8 Total wall clock time: 1004.0 ms, total callbacks time: 4450 ms,
>>> entry
>>> > size 0,034 MB -> 4,45 ms per entry (latency),1,0 ms per entry
>>> (throughput)
>>> > 34,0 MB/s throughput
>>> > #9 Total wall clock time: 1002.0 ms, total callbacks time: 4542 ms,
>>> entry
>>> > size 0,034 MB -> 4,54 ms per entry (latency),1,0 ms per entry
>>> (throughput)
>>> > 34,1 MB/s throughput
>>> >
>>> > gc.log
>>> > Java HotSpot(TM) 64-Bit Server VM (25.121-b13) for linux-amd64 JRE
>>> > (1.8.0_121-b13), built on Dec 12 2016 16:36:53 by "java_re" with gcc
>>> 4.3.0
>>> > 20080428 (Red Hat 4.3.0-8)
>>> > Memory: 4k page, physical 32870636k(8931792k free), swap
>>> > 15622140k(15464700k free)
>>> > CommandLine flags: -XX:InitialHeapSize=2147483648
>>> > -XX:MaxHeapSize=2147483648 -XX:+PrintGC
>>> -XX:+PrintGCTimeStamps
>>> > -XX:+UseCompressedClassPointers -XX:+UseCompressedOops -XX:+UseG1GC
>>> > 0,750: [GC pause (G1 Evacuation Pause) (young) 106M->14M(2048M),
>>> 0,0070948
>>> > secs]
>>> > 1,089: [GC pause (G1 Evacuation Pause) (young) 127M->21M(2048M),
>>> 0,0038837
>>> > secs]
>>> > 1,608: [GC pause (G1 Evacuation Pause) (young) 193M->41M(2048M),
>>> 0,0039903
>>> > secs]
>>> > 3,745: [GC pause (G1 Evacuation Pause) (young) 733M->49M(2048M),
>>> 0,0046991
>>> > secs]
>>> > 7,990: [GC pause (G1 Evacuation Pause) (young) 1423M->65M(2048M),
>>> 0,0052844
>>> > secs]
>>> >
>>> >
>>> >
>>> > with 4.4 - v3 protocol Parallel GC
>>> >
>>> > #0 Total wall clock time: 999.0 ms, total callbacks time: 6007 ms,
>>> entry
>>> > size 0,034 MB -> 6,01 ms per entry (latency),1,0 ms per entry
>>> (throughput)
>>> > 34,2 MB/s throughput
>>> > #1 Total wall clock time: 1001.0 ms, total callbacks time: 4813 ms,
>>> entry
>>> > size 0,034 MB -> 4,81 ms per entry (latency),1,0 ms per entry
>>> (throughput)
>>> > 34,1 MB/s throughput
>>> > #2 Total wall clock time: 1003.0 ms, total callbacks time: 4580 ms,
>>> entry
>>> > size 0,034 MB -> 4,58 ms per entry (latency),1,0 ms per entry
>>> (throughput)
>>> > 34,1 MB/s throughput
>>> > #3 Total wall clock time: 1002.0 ms, total callbacks time: 4349 ms,
>>> entry
>>> > size 0,034 MB -> 4,35 ms per entry (latency),1,0 ms per entry
>>> (throughput)
>>> > 34,1 MB/s throughput
>>> > #4 Total wall clock time: 1002.0 ms, total callbacks time: 4456 ms,
>>> entry
>>> > size 0,034 MB -> 4,46 ms per entry (latency),1,0 ms per entry
>>> (throughput)
>>> > 34,1 MB/s throughput
>>> > #5 Total wall clock time: 1006.0 ms, total callbacks time: 4504 ms,
>>> entry
>>> > size 0,034 MB -> 4,50 ms per entry (latency),1,0 ms per entry
>>> (throughput)
>>> > 34,0 MB/s throughput
>>> > #6 Total wall clock time: 1003.0 ms, total callbacks time: 4387 ms,
>>> entry
>>> > size 0,034 MB -> 4,39 ms per entry (latency),1,0 ms per entry
>>> (throughput)
>>> > 34,1 MB/s throughput
>>> > #7 Total wall clock time: 1002.0 ms, total callbacks time: 4643 ms,
>>> entry
>>> > size 0,034 MB -> 4,64 ms per entry (latency),1,0 ms per entry
>>> (throughput)
>>> > 34,1 MB/s throughput
>>> > #8 Total wall clock time: 1002.0 ms, total callbacks time: 4548 ms,
>>> entry
>>> > size 0,034 MB -> 4,55 ms per entry (latency),1,0 ms per entry
>>> (throughput)
>>> > 34,1 MB/s throughput
>>> > #9 Total wall clock time: 1001.0 ms, total callbacks time: 4633 ms,
>>> entry
>>> > size 0,034 MB -> 4,63 ms per entry (latency),1,0 ms per entry
>>> (throughput)
>>> > 34,1 MB/s throughput
>>> >
>>> > gc.log
>>> > Java HotSpot(TM) 64-Bit Server VM (25.121-b13) for linux-amd64 JRE
>>> > (1.8.0_121-b13), built on Dec 12 2016 16:36:53 by "java_re" with gcc
>>> 4.3.0
>>> > 20080428 (Red Hat 4.3.0-8)
>>> > Memory: 4k page, physical 32870636k(8877776k free), swap
>>> > 15622140k(15464700k free)
>>> > CommandLine flags: -XX:InitialHeapSize=2147483648
>>> > -XX:MaxHeapSize=2147483648 -XX:+PrintGC
>>> -XX:+PrintGCTimeStamps
>>> > -XX:+UseCompressedClassPointers -XX:+UseCompressedOops
>>> -XX:+UseParallelGC
>>> > 0,603: [GC (Allocation Failure)  524800K->8488K(2010112K), 0,0056746
>>> secs]
>>> > 1,996: [GC (Allocation Failure)  533268K->181534K(2010112K), 0,0248460
>>> > secs]
>>> > 3,475: [GC (Allocation Failure)  703337K->237612K(2010112K), 0,0192195
>>> > secs]
>>> > 4,963: [GC (Allocation Failure)  762412K->212913K(2010112K), 0,0085357
>>> > secs]
>>> > 6,473: [GC (Allocation Failure)  737684K->192597K(1951232K), 0,0065437
>>> > secs]
>>> > 8,019: [GC (Allocation Failure)  704597K->319617K(1863680K), 0,0189275
>>> > secs]
>>> > 9,008: [GC (Allocation Failure)  651905K->315463K(1853952K), 0,0121369
>>> > secs]
>>> > 9,997: [GC (Allocation Failure)  647729K->418132K(1848832K), 0,0222260
>>> > secs]
>>> > 10,742: [GC (Allocation Failure)  679764K->340097K(1878528K), 0,0111347
>>> > secs]
>>> >
>>> >
>>> > with 4.5 - v2 protocol Parallel GC
>>> >
>>> > #0 Total wall clock time: 1004.0 ms, total callbacks time: 8030 ms,
>>> entry
>>> > size 0,034 MB -> 8,03 ms per entry (latency),1,0 ms per entry
>>> (throughput)
>>> > 34,0 MB/s throughput
>>> > #1 Total wall clock time: 1003.0 ms, total callbacks time: 4394 ms,
>>> entry
>>> > size 0,034 MB -> 4,39 ms per entry (latency),1,0 ms per entry
>>> (throughput)
>>> > 34,1 MB/s throughput
>>> > #2 Total wall clock time: 1002.0 ms, total callbacks time: 4417 ms,
>>> entry
>>> > size 0,034 MB -> 4,42 ms per entry (latency),1,0 ms per entry
>>> (throughput)
>>> > 34,1 MB/s throughput
>>> > #3 Total wall clock time: 1005.0 ms, total callbacks time: 4362 ms,
>>> entry
>>> > size 0,034 MB -> 4,36 ms per entry (latency),1,0 ms per entry
>>> (throughput)
>>> > 34,0 MB/s throughput
>>> > #4 Total wall clock time: 1003.0 ms, total callbacks time: 4435 ms,
>>> entry
>>> > size 0,034 MB -> 4,44 ms per entry (latency),1,0 ms per entry
>>> (throughput)
>>> > 34,1 MB/s throughput
>>> > #5 Total wall clock time: 1002.0 ms, total callbacks time: 4489 ms,
>>> entry
>>> > size 0,034 MB -> 4,49 ms per entry (latency),1,0 ms per entry
>>> (throughput)
>>> > 34,1 MB/s throughput
>>> > #6 Total wall clock time: 1002.0 ms, total callbacks time: 4346 ms,
>>> entry
>>> > size 0,034 MB -> 4,35 ms per entry (latency),1,0 ms per entry
>>> (throughput)
>>> > 34,1 MB/s throughput
>>> > #7 Total wall clock time: 1004.0 ms, total callbacks time: 4474 ms,
>>> entry
>>> > size 0,034 MB -> 4,47 ms per entry (latency),1,0 ms per entry
>>> (throughput)
>>> > 34,0 MB/s throughput
>>> > #8 Total wall clock time: 1005.0 ms, total callbacks time: 4436 ms,
>>> entry
>>> > size 0,034 MB -> 4,44 ms per entry (latency),1,0 ms per entry
>>> (throughput)
>>> > 34,0 MB/s throughput
>>> > #9 Total wall clock time: 1001.0 ms, total callbacks time: 4322 ms,
>>> entry
>>> > size 0,034 MB -> 4,32 ms per entry (latency),1,0 ms per entry
>>> (throughput)
>>> > 34,1 MB/s throughput
>>> >
>>> > gc.log
>>> > Java HotSpot(TM) 64-Bit Server VM (25.121-b13) for linux-amd64 JRE
>>> > (1.8.0_121-b13), built on Dec 12 2016 16:36:53 by "java_re" with gcc
>>> 4.3.0
>>> > 20080428 (Red Hat 4.3.0-8)
>>> > Memory: 4k page, physical 32870636k(8951980k free), swap
>>> > 15622140k(15464700k free)
>>> > CommandLine flags: -XX:InitialHeapSize=2147483648
>>> > -XX:MaxHeapSize=2147483648 -XX:+PrintGC
>>> -XX:+PrintGCTimeStamps
>>> > -XX:+UseCompressedClassPointers -XX:+UseCompressedOops
>>> -XX:+UseParallelGC
>>> > 0,583: [GC (Allocation Failure)  524800K->8921K(2010112K), 0,0054470
>>> secs]
>>> > 6,397: [GC (Allocation Failure)  533721K->178956K(2010112K), 0,0306493
>>> secs
>>> >
>>> >
>>> > with 4.5 - v3 protocol Parallel GC
>>> >
>>> > #0 Total wall clock time: 1002.0 ms, total callbacks time: 10679 ms,
>>> entry
>>> > size 0,034 MB -> 10,68 ms per entry (latency),1,0 ms per entry
>>> (throughput)
>>> > 34,1 MB/s throughput
>>> > #1 Total wall clock time: 1003.0 ms, total callbacks time: 4802 ms,
>>> entry
>>> > size 0,034 MB -> 4,80 ms per entry (latency),1,0 ms per entry
>>> (throughput)
>>> > 34,1 MB/s throughput
>>> > #2 Total wall clock time: 1003.0 ms, total callbacks time: 4479 ms,
>>> entry
>>> > size 0,034 MB -> 4,48 ms per entry (latency),1,0 ms per entry
>>> (throughput)
>>> > 34,1 MB/s throughput
>>> > #3 Total wall clock time: 1002.0 ms, total callbacks time: 4726 ms,
>>> entry
>>> > size 0,034 MB -> 4,73 ms per entry (latency),1,0 ms per entry
>>> (throughput)
>>> > 34,1 MB/s throughput
>>> > #4 Total wall clock time: 1007.0 ms, total callbacks time: 4393 ms,
>>> entry
>>> > size 0,034 MB -> 4,39 ms per entry (latency),1,0 ms per entry
>>> (throughput)
>>> > 33,9 MB/s throughput
>>> > #5 Total wall clock time: 1006.0 ms, total callbacks time: 4619 ms,
>>> entry
>>> > size 0,034 MB -> 4,62 ms per entry (latency),1,0 ms per entry
>>> (throughput)
>>> > 34,0 MB/s throughput
>>> > #6 Total wall clock time: 1002.0 ms, total callbacks time: 4346 ms,
>>> entry
>>> > size 0,034 MB -> 4,35 ms per entry (latency),1,0 ms per entry
>>> (throughput)
>>> > 34,1 MB/s throughput
>>> > #7 Total wall clock time: 1002.0 ms, total callbacks time: 4402 ms,
>>> entry
>>> > size 0,034 MB -> 4,40 ms per entry (latency),1,0 ms per entry
>>> (throughput)
>>> > 34,1 MB/s throughput
>>> > #8 Total wall clock time: 1004.0 ms, total callbacks time: 4487 ms,
>>> entry
>>> > size 0,034 MB -> 4,49 ms per entry (latency),1,0 ms per entry
>>> (throughput)
>>> > 34,0 MB/s throughput
>>> > #9 Total wall clock time: 1007.0 ms, total callbacks time: 4315 ms,
>>> entry
>>> > size 0,034 MB -> 4,32 ms per entry (latency),1,0 ms per entry
>>> (throughput)
>>> > 33,9 MB/s throughput
>>> >
>>> > gc.log
>>> >
>>> > Java HotSpot(TM) 64-Bit Server VM (25.121-b13) for linux-amd64 JRE
>>> > (1.8.0_121-b13), built on Dec 12 2016 16:36:53 by "java_re" with gcc
>>> 4.3.0
>>> > 20080428 (Red Hat 4.3.0-8)
>>> > Memory: 4k page, physical 32870636k(8921388k free), swap
>>> > 15622140k(15464700k free)
>>> > CommandLine flags: -XX:InitialHeapSize=2147483648
>>> > -XX:MaxHeapSize=2147483648 -XX:+PrintGC
>>> -XX:+PrintGCTimeStamps
>>> > -XX:+UseCompressedClassPointers -XX:+UseCompressedOops
>>> -XX:+UseParallelGC
>>> > 0,563: [GC (Allocation Failure)  524800K->8935K(2010112K), 0,0055378
>>> secs]
>>> > 1,936: [GC (Allocation Failure)  533735K->187494K(2010112K), 0,0281195
>>> > secs]
>>> > 3,917: [GC (Allocation Failure)  712294K->275491K(2010112K), 0,0298168
>>> > secs]
>>> > 6,385: [GC (Allocation Failure)  798141K->394430K(2010112K), 0,0279730
>>> > secs]
>>> > 8,934: [GC (Allocation Failure)  919230K->394530K(2008576K), 0,0127794
>>> > secs]
>>> > 11,634: [GC (Metadata GC Threshold)  811154K->348997K(1868288K),
>>> 0,0080161
>>> > secs]
>>> > 11,642: [Full GC (Metadata GC Threshold)  348997K->39874K(1868288K),
>>> > 0,0247462 secs]
>>> >
>>> >
>>> > 4.5 v3 protocol parallel GC throttle = 0
>>> >
>>> > #0 Total wall clock time: 334.0 ms, total callbacks time: 267444 ms,
>>> entry
>>> > size 0,034 MB -> 267,44 ms per entry (latency),0,3 ms per entry
>>> > (throughput) 102,3 MB/s throughput
>>> > #1 Total wall clock time: 189.0 ms, total callbacks time: 154428 ms,
>>> entry
>>> > size 0,034 MB -> 154,43 ms per entry (latency),0,2 ms per entry
>>> > (throughput) 180,8 MB/s throughput
>>> > #2 Total wall clock time: 136.0 ms, total callbacks time: 103012 ms,
>>> entry
>>> > size 0,034 MB -> 103,01 ms per entry (latency),0,1 ms per entry
>>> > (throughput) 251,3 MB/s throughput
>>> > #3 Total wall clock time: 156.0 ms, total callbacks time: 113315 ms,
>>> entry
>>> > size 0,034 MB -> 113,32 ms per entry (latency),0,2 ms per entry
>>> > (throughput) 219,1 MB/s throughput
>>> > #4 Total wall clock time: 110.0 ms, total callbacks time: 79296 ms,
>>> entry
>>> > size 0,034 MB -> 79,30 ms per entry (latency),0,1 ms per entry
>>> (throughput)
>>> > 310,7 MB/s throughput
>>> > #5 Total wall clock time: 104.0 ms, total callbacks time: 71213 ms,
>>> entry
>>> > size 0,034 MB -> 71,21 ms per entry (latency),0,1 ms per entry
>>> (throughput)
>>> > 328,7 MB/s throughput
>>> > #6 Total wall clock time: 161.0 ms, total callbacks time: 72447 ms,
>>> entry
>>> > size 0,034 MB -> 72,45 ms per entry (latency),0,2 ms per entry
>>> (throughput)
>>> > 212,3 MB/s throughput
>>> > #7 Total wall clock time: 103.0 ms, total callbacks time: 82783 ms,
>>> entry
>>> > size 0,034 MB -> 82,78 ms per entry (latency),0,1 ms per entry
>>> (throughput)
>>> > 331,8 MB/s throughput
>>> > #8 Total wall clock time: 138.0 ms, total callbacks time: 106086 ms,
>>> entry
>>> > size 0,034 MB -> 106,09 ms per entry (latency),0,1 ms per entry
>>> > (throughput) 247,7 MB/s throughput
>>> > #9 Total wall clock time: 118.0 ms, total callbacks time: 77750 ms,
>>> entry
>>> > size 0,034 MB -> 77,75 ms per entry (latency),0,1 ms per entry
>>> (throughput)
>>> > 289,7 MB/s throughput
>>> >
>>> > Java HotSpot(TM) 64-Bit Server VM (25.121-b13) for linux-amd64 JRE
>>> > (1.8.0_121-b13), built on Dec 12 2016 16:36:53 by "java_re" with gcc
>>> 4.3.0
>>> > 20080428 (Red Hat 4.3.0-8)
>>> > Memory: 4k page, physical 32870636k(8858832k free), swap
>>> > 15622140k(15464700k free)
>>> > CommandLine flags: -XX:InitialHeapSize=2147483648
>>> > -XX:MaxHeapSize=2147483648 -XX:+PrintGC
>>> -XX:+PrintGCTimeStamps
>>> > -XX:+UseCompressedClassPointers -XX:+UseCompressedOops
>>> -XX:+UseParallelGC
>>> > 0,660: [GC (Allocation Failure)  524800K->8955K(2010112K), 0,0061442
>>> secs]
>>> > 1,288: [GC (Allocation Failure)  533726K->191538K(2010112K), 0,0329663
>>> > secs]
>>> > 1,634: [GC (Allocation Failure)  716338K->349468K(2010112K), 0,0508922
>>> > secs]
>>> > 1,942: [GC (Allocation Failure)  874268K->671717K(2010112K), 0,0609749
>>> > secs]
>>> > 2,266: [GC (Allocation Failure)  1196517K->781335K(2010112K), 0,0320305
>>> > secs]
>>> >
>>> > 4.5 v2 protocol throttle = 0 G1 GC
>>> >
>>> > #0 Total wall clock time: 200.0 ms, total callbacks time: 173682 ms,
>>> entry
>>> > size 0,034 MB -> 173,68 ms per entry (latency),0,2 ms per entry
>>> > (throughput) 170,9 MB/s throughput
>>> > #1 Total wall clock time: 128.0 ms, total callbacks time: 89987 ms,
>>> entry
>>> > size 0,034 MB -> 89,99 ms per entry (latency),0,1 ms per entry
>>> (throughput)
>>> > 267,0 MB/s throughput
>>> > #2 Total wall clock time: 117.0 ms, total callbacks time: 86411 ms,
>>> entry
>>> > size 0,034 MB -> 86,41 ms per entry (latency),0,1 ms per entry
>>> (throughput)
>>> > 292,1 MB/s throughput
>>> > #3 Total wall clock time: 95.0 ms, total callbacks time: 64054 ms,
>>> entry
>>> > size 0,034 MB -> 64,05 ms per entry (latency),0,1 ms per entry
>>> (throughput)
>>> > 359,8 MB/s throughput
>>> > #4 Total wall clock time: 92.0 ms, total callbacks time: 60863 ms,
>>> entry
>>> > size 0,034 MB -> 60,86 ms per entry (latency),0,1 ms per entry
>>> (throughput)
>>> > 371,5 MB/s throughput
>>> > #5 Total wall clock time: 92.0 ms, total callbacks time: 62026 ms,
>>> entry
>>> > size 0,034 MB -> 62,03 ms per entry (latency),0,1 ms per entry
>>> (throughput)
>>> > 371,5 MB/s throughput
>>> > #6 Total wall clock time: 93.0 ms, total callbacks time: 65033 ms,
>>> entry
>>> > size 0,034 MB -> 65,03 ms per entry (latency),0,1 ms per entry
>>> (throughput)
>>> > 367,5 MB/s throughput
>>> > #7 Total wall clock time: 88.0 ms, total callbacks time: 59851 ms,
>>> entry
>>> > size 0,034 MB -> 59,85 ms per entry (latency),0,1 ms per entry
>>> (throughput)
>>> > 388,4 MB/s throughput
>>> > #8 Total wall clock time: 93.0 ms, total callbacks time: 61816 ms,
>>> entry
>>> > size 0,034 MB -> 61,82 ms per entry (latency),0,1 ms per entry
>>> (throughput)
>>> > 367,5 MB/s throughput
>>> > #9 Total wall clock time: 93.0 ms, total callbacks time: 70397 ms,
>>> entry
>>> > size 0,034 MB -> 70,40 ms per entry (latency),0,1 ms per entry
>>> (throughput)
>>> > 367,5 MB/s throughput
>>> >
>>> >
>>> > Java HotSpot(TM) 64-Bit Server VM (25.121-b13) for linux-amd64 JRE
>>> > (1.8.0_121-b13), built on Dec 12 2016 16:36:53 by "java_re" with gcc
>>> 4.3.0
>>> > 20080428 (Red Hat 4.3.0-8)
>>> > Memory: 4k page, physical 32870636k(8894828k free), swap
>>> > 15622140k(15464700k free)
>>> > CommandLine flags: -XX:InitialHeapSize=2147483648
>>> > -XX:MaxHeapSize=2147483648 -XX:+PrintGC
>>> -XX:+PrintGCTimeStamps
>>> > -XX:+UseCompressedClassPointers -XX:+UseCompressedOops -XX:+UseG1GC
>>> > 1,078: [GC pause (G1 Evacuation Pause) (young) 150M->65M(2048M),
>>> 0,0138100
>>> > secs]
>>> >
>>> > bk 4.5 v3 protocol G1 GC throttle = 0
>>> >
>>> > #0 Total wall clock time: 275.0 ms, total callbacks time: 238043 ms,
>>> entry
>>> > size 0,034 MB -> 238,04 ms per entry (latency),0,3 ms per entry
>>> > (throughput) 124,3 MB/s throughput
>>> > #1 Total wall clock time: 148.0 ms, total callbacks time: 106877 ms,
>>> entry
>>> > size 0,034 MB -> 106,88 ms per entry (latency),0,1 ms per entry
>>> > (throughput) 230,9 MB/s throughput
>>> > #2 Total wall clock time: 130.0 ms, total callbacks time: 96448 ms,
>>> entry
>>> > size 0,034 MB -> 96,45 ms per entry (latency),0,1 ms per entry
>>> (throughput)
>>> > 262,9 MB/s throughput
>>> > #3 Total wall clock time: 118.0 ms, total callbacks time: 78246 ms,
>>> entry
>>> > size 0,034 MB -> 78,25 ms per entry (latency),0,1 ms per entry
>>> (throughput)
>>> > 289,7 MB/s throughput
>>> > #4 Total wall clock time: 109.0 ms, total callbacks time: 79110 ms,
>>> entry
>>> > size 0,034 MB -> 79,11 ms per entry (latency),0,1 ms per entry
>>> (throughput)
>>> > 313,6 MB/s throughput
>>> > #5 Total wall clock time: 124.0 ms, total callbacks time: 86652 ms,
>>> entry
>>> > size 0,034 MB -> 86,65 ms per entry (latency),0,1 ms per entry
>>> (throughput)
>>> > 275,6 MB/s throughput
>>> > #6 Total wall clock time: 110.0 ms, total callbacks time: 76729 ms,
>>> entry
>>> > size 0,034 MB -> 76,73 ms per entry (latency),0,1 ms per entry
>>> (throughput)
>>> > 310,7 MB/s throughput
>>> > #7 Total wall clock time: 110.0 ms, total callbacks time: 80104 ms,
>>> entry
>>> > size 0,034 MB -> 80,10 ms per entry (latency),0,1 ms per entry
>>> (throughput)
>>> > 310,7 MB/s throughput
>>> > #8 Total wall clock time: 104.0 ms, total callbacks time: 73514 ms,
>>> entry
>>> > size 0,034 MB -> 73,51 ms per entry (latency),0,1 ms per entry
>>> (throughput)
>>> > 328,7 MB/s throughput
>>> > #9 Total wall clock time: 102.0 ms, total callbacks time: 74487 ms,
>>> entry
>>> > size 0,034 MB -> 74,49 ms per entry (latency),0,1 ms per entry
>>> (throughput)
>>> > 335,1 MB/s throughput
>>> >
>>> > Java HotSpot(TM) 64-Bit Server VM (25.121-b13) for linux-amd64 JRE
>>> > (1.8.0_121-b13), built on Dec 12 2016 16:36:53 by "java_re" with gcc
>>> 4.3.0
>>> > 20080428 (Red Hat 4.3.0-8)
>>> > Memory: 4k page, physical 32870636k(8958604k free), swap
>>> > 15622140k(15464700k free)
>>> > CommandLine flags: -XX:InitialHeapSize=2147483648
>>> > -XX:MaxHeapSize=2147483648 -XX:+PrintGC
>>> -XX:+PrintGCTimeStamps
>>> > -XX:+UseCompressedClassPointers -XX:+UseCompressedOops -XX:+UseG1GC
>>> > 0,876: [GC pause (G1 Evacuation Pause) (young) 102M->19M(2048M),
>>> 0,0107143
>>> > secs]
>>> > 0,989: [GC pause (G1 Evacuation Pause) (young) 152M->81M(2048M),
>>> 0,0074561
>>> > secs]
>>> > 1,125: [GC pause (G1 Evacuation Pause) (young) 198M->108M(2048M),
>>> 0,0038296
>>> > secs]
>>> > 1,286: [GC pause (G1 Evacuation Pause) (young) 306M->146M(2048M),
>>> 0,0034297
>>> > secs]
>>> >
>>> >
>>> >
>>> > 2017-07-12 3:07 GMT+02:00 Sijie Guo <[email protected]>:
>>> >
>>> > > On Wed, Jul 12, 2017 at 2:04 AM, Venkateswara Rao Jujjuri <
>>> > > [email protected]
>>> > > > wrote:
>>> > >
>>> > > > Enrico, let me try to paraphrase the issue.
>>> > > >
>>> > > > - With G1GC, Netty 4.1 is giving you roughly the same perf as the
>>> > > > previous release. Is that an accurate statement?
>>> > > > But you are still seeing latency spikes with Netty 4.1??
>>> > >
>>> > >
>>> > >
>>> > >
>>> > >
>>> > > > I did not fully understand your sleep use case.
>>> > > > How is sleep yielding better latency?
>>> > > >
>>> > >
>>> > > I think the "sleep(1)" in Enrico's benchmark is for throttling. It is
>>> > > the same as setting the 'throttle' value to 1000 in his test.
>>> > >
>>> > > - Sijie
>>> > >
>>> > >
>>> > >
>>> > > >
>>> > > > Thanks,
>>> > > > JV
>>> > > >
>>> > > > On Tue, Jul 11, 2017 at 8:27 AM, Enrico Olivelli <
>>> [email protected]>
>>> > > > wrote:
>>> > > >
>>> > > > > Another interesting thing: during my profiling activity I gave
>>> > > > > the old v2 protocol a chance and activated the GC logs; as
>>> > > > > expected, with the v2 protocol there is almost no GC activity
>>> > > > > during the benchmark
>>> > > > > -- Enrico
>>> > > > >
>>> > > > > 2017-07-11 12:07 GMT+02:00 Enrico Olivelli <[email protected]
>>> >:
>>> > > > >
>>> > > > > >
>>> > > > > >
>>> > > > > > 2017-07-11 11:04 GMT+02:00 Sijie Guo <[email protected]>:
>>> > > > > >
>>> > > > > >> I think Netty4 requires more offheap memory. You might need to
>>> > > > > >> tune the JVM settings. I suspect the latency diff is coming
>>> > > > > >> from the JVM gc.
>>> > > > > >>
>>> > > > > >> A simple thing to verify that is to dump the gc log by adding
>>> > > > > >> the "-Xloggc:" setting and comparing the gc logs between
>>> > > > > >> versions.
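>>> > > > > >>
>>> > > > > >> For example, something along these lines (a sketch of the
>>> > > > > >> flags only; the file name is just an example):
>>> > > > > >>
>>> > > > > >>     -Xloggc:gc.log -XX:+PrintGC -XX:+PrintGCTimeStamps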
>>> > > > > >>
>>> > > > > >
>>> > > > > >
>>> > > > > > Enabling G1 GC (-XX:+UseG1GC) apparently solves the difference
>>> > > > > > between 4.4 and 4.5.
>>> > > > > >
>>> > > > > > And if I loop the bench (in order to have a warm JVM) the
>>> > > > > > numbers become more stable and similar to each other.
>>> > > > > >
>>> > > > > > I have the "other issue" pending: the fact that latency (the
>>> > > > > > time for asyncAddEntry to complete) is really bad and
>>> > > > > > unpredictable (from 100 ms to 2000 ms),
>>> > > > > >
>>> > > > > > and if I introduce a Thread.sleep(1) all the callbacks always
>>> > > > > > complete with success in < 10 ms. This happens even with G1 GC.
>>> > > > > >
>>> > > > > > Without the "sleep(1)" my machine uses a lot of CPU (I have 8
>>> > > > > > "CPUs"), and with the "sleep(1)" the load is slightly lower.
>>> > > > > >
>>> > > > > > Honestly for me this is still a problem, and I hope that with
>>> > > > > > your help I will be able to find it, wherever it is (in the BK
>>> > > > > > code or in the way I am doing the bench).
>>> > > > > >
>>> > > > > > I will try to create a new more complete bench
>>> > > > > >
>>> > > > > > -- Enrico
>>> > > > > >
>>> > > > > >
>>> > > > > >
>>> > > > > >
>>> > > > > >>
>>> > > > > >> - Sijie
>>> > > > > >>
>>> > > > > >> On Tue, Jul 11, 2017 at 12:16 AM, Enrico Olivelli <
>>> > > > [email protected]>
>>> > > > > >> wrote:
>>> > > > > >>
>>> > > > > >> > I did a bisect and the culprit (in my opinion) for the
>>> > > > > >> > performance regression from 4.4 to 4.5 is the switch to
>>> > > > > >> > Netty 4.
>>> > > > > >> >
>>> > > > > >> > at commit:
>>> > > > > >> > commit 811ece53a1c975c4e768422f3d622ac9de6b3e41
>>> > > > > >> > BOOKKEEPER-1058: Ignore already deleted ledger on replication audit
>>> > > > > >> >
>>> > > > > >> > Total time: 204 ms
>>> > > > > >> > Total real time: 79 ms per entry
>>> > > > > >> >
>>> > > > > >> > at commit:
>>> > > > > >> > commit 74f795136c1fff3badb29fc982d0cc2d43096b45
>>> > > > > >> > BOOKKEEPER-1008: Netty 4.1
>>> > > > > >> >
>>> > > > > >> > Total time: 308 ms
>>> > > > > >> > Total real time: 189 ms per entry
>>> > > > > >> >
>>> > > > > >> > I have tried with epoll and with local transport; the
>>> > > > > >> > results do not change.
>>> > > > > >> > I tried upgrading to Netty 4.1.13 too, but no change
>>> > > > > >> >
>>> > > > > >> > Could it be the memory allocator of Netty, overwhelmed by
>>> > > > > >> > sudden bursts of allocation?
>>> > > > > >> > I did some trials with UnpooledByteBufAllocator.DEFAULT and
>>> > > > > >> > it helps a little; we get to "110 ms per entry" vs "189 ms
>>> > > > > >> > per entry"
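>>> > > > > >> >
>>> > > > > >> > (Roughly what that trial looks like, as a sketch with an
>>> > > > > >> > assumed Netty Bootstrap variable, not the exact client code:
>>> > > > > >> >
>>> > > > > >> >     bootstrap.option(ChannelOption.ALLOCATOR,
>>> > > > > >> >             UnpooledByteBufAllocator.DEFAULT);
>>> > > > > >> >
>>> > > > > >> > instead of the default pooled allocator.)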
>>> > > > > >> >
>>> > > > > >> > the bench is here:
>>> > > > > >> >
>>> > > > > >> > https://github.com/eolivelli/bookkeepers-benchs/blob/master/src/test/java/BookKeeperWriteTest.java
>>> > > > > >> >
>>> > > > > >> >
>>> > > > > >> > -- Enrico
>>> > > > > >> >
>>> > > > > >> >
>>> > > > > >> >
>>> > > > > >> > 2017-07-10 19:46 GMT+02:00 Enrico Olivelli <
>>> [email protected]
>>> > >:
>>> > > > > >> >
>>> > > > > >> > >
>>> > > > > >> > >
>>> > > > > >> > > On Mon, Jul 10, 2017 at 18:21 Venkateswara Rao Jujjuri
>>> > > > > >> > > <[email protected]> wrote:
>>> > > > > >> > >
>>> > > > > >> > >> With the Netty changes, lack of native epoll() has a huge
>>> > > > > >> > >> perf impact, as per Kishore.
>>> > > > > >> > >> Are you sure you are using epoll()?
>>> > > > > >> > >>
>>> > > > > >> > >
>>> > > > > >> > > Yes. I tried with the Netty local transport too. It does
>>> > > > > >> > > not seem related to Netty to me.
>>> > > > > >> > > I will double check tomorrow.
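>>> > > > > >> > > (As a sketch of the check I will do: assert that
>>> > > > > >> > > io.netty.channel.epoll.Epoll.isAvailable() returns true in
>>> > > > > >> > > the client JVM before running the bench.)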
>>> > > > > >> > > Enrico
>>> > > > > >> > >
>>> > > > > >> > >
>>> > > > > >> > >> On Mon, Jul 10, 2017 at 1:49 AM, Enrico Olivelli <
>>> > > > > >> [email protected]>
>>> > > > > >> > >> wrote:
>>> > > > > >> > >>
>>> > > > > >> > >> > 2017-07-10 10:40 GMT+02:00 Sijie Guo <
>>> [email protected]>:
>>> > > > > >> > >> >
>>> > > > > >> > >> > > Also one other thing to check is the JVM settings.
>>> Do you
>>> > > > mind
>>> > > > > >> > sharing
>>> > > > > >> > >> > that
>>> > > > > >> > >> > > as well?
>>> > > > > >> > >> > >
>>> > > > > >> > >> > >
>>> > > > > >> > >> > This is the surefire config; I am using Oracle JDK 8.
>>> > > > > >> > >> >
>>> > > > > >> > >> >             <plugin>
>>> > > > > >> > >> >                 <artifactId>maven-surefire-plugin</artifactId>
>>> > > > > >> > >> >                 <version>2.20</version>
>>> > > > > >> > >> >                 <configuration>
>>> > > > > >> > >> >                     <forkCount>1</forkCount>
>>> > > > > >> > >> >                     <reuseForks>false</reuseForks>
>>> > > > > >> > >> >                     <forkedProcessTimeoutInSeconds>300</forkedProcessTimeoutInSeconds>
>>> > > > > >> > >> >                     <argLine>-Xmx2G -Djava.io.tmpdir=${basedir}/target</argLine>
>>> > > > > >> > >> >                 </configuration>
>>> > > > > >> > >> >             </plugin>
>>> > > > > >> > >> >
>>> > > > > >> > >> > -- Enrico
>>> > > > > >> > >> >
>>> > > > > >> > >> >
>>> > > > > >> > >> >
>>> > > > > >> > >> > > Sijie
>>> > > > > >> > >> > >
>>> > > > > >> > >> > > On Jul 10, 2017 1:17 AM, "Sijie Guo" <
>>> [email protected]
>>> > >
>>> > > > > wrote:
>>> > > > > >> > >> > >
>>> > > > > >> > >> > > > I am not sure if any default values changed for the
>>> > > > > >> > >> > > > journal settings.
>>> > > > > >> > >> > > > I would suggest testing by setting the journal
>>> > > > > >> > >> > > > settings explicitly.
>>> > > > > >> > >> > > >
>>> > > > > >> > >> > > > Also if you can share your benchmark, that would be
>>> > good
>>> > > > for
>>> > > > > >> other
>>> > > > > >> > >> > people
>>> > > > > >> > >> > > > to verify.
>>> > > > > >> > >> > > >
>>> > > > > >> > >> > > > Sijie
>>> > > > > >> > >> > > >
>>> > > > > >> > >> > > > On Jul 10, 2017 12:32 AM, "Enrico Olivelli" <
>>> > > > > >> [email protected]>
>>> > > > > >> > >> > wrote:
>>> > > > > >> > >> > > >
>>> > > > > >> > >> > > >> Hi,
>>> > > > > >> > >> > > >> I am doing some benchmarks on BK; I see that from
>>> > > > > >> > >> > > >> 4.4.0 to 4.5.0 something is "slow" but I cannot
>>> > > > > >> > >> > > >> understand what. I really hope that I am wrong.
>>> > > > > >> > >> > > >>
>>> > > > > >> > >> > > >> I am working on writes; I will move to reads once
>>> > > > > >> > >> > > >> writes are OK.
>>> > > > > >> > >> > > >> My problem is both latency (the time for the
>>> > > > > >> > >> > > >> AddComplete callback to complete) and overall
>>> > > > > >> > >> > > >> throughput.
>>> > > > > >> > >> > > >>
>>> > > > > >> > >> > > >> Actually I have two distinct problems, but while
>>> > > > > >> > >> > > >> working on the first problem I found a performance
>>> > > > > >> > >> > > >> regression.
>>> > > > > >> > >> > > >> I know that talking about "slow" things is a hard
>>> > > > > >> > >> > > >> matter, so I will try to describe as much as
>>> > > > > >> > >> > > >> possible all the aspects that I think are relevant.
>>> > > > > >> > >> > > >>
>>> > > > > >> > >> > > >> First problem: under certain load, performance
>>> > > > > >> > >> > > >> (latency + throughput) degrades too much.
>>> > > > > >> > >> > > >> Second problem: the first problem is more evident
>>> > > > > >> > >> > > >> in 4.5.0.
>>> > > > > >> > >> > > >>
>>> > > > > >> > >> > > >> Let's describe my testcase and why I am worried.
>>> > > > > >> > >> > > >> The bench issues a batch of asyncAddEntry and
>>> prints
>>> > the
>>> > > > > >> average
>>> > > > > >> > >> time
>>> > > > > >> > >> > > for
>>> > > > > >> > >> > > >> AddComplete to complete and the overall clock
>>> time.
>>> > > > > >> > >> > > >>
>>> > > > > >> > >> > > >> This is the code
>>> > > > > >> > >> > > >>
>>> > > > > >> > >> > > >> // needed imports: java.util.Collection,
>>> > > > > >> > >> > > >> // java.util.concurrent.CompletableFuture,
>>> > > > > >> > >> > > >> // java.util.concurrent.ConcurrentLinkedQueue,
>>> > > > > >> > >> > > >> // java.util.concurrent.atomic.LongAdder,
>>> > > > > >> > >> > > >> // org.apache.bookkeeper.client.*,
>>> > > > > >> > >> > > >> // org.apache.bookkeeper.conf.ClientConfiguration,
>>> > > > > >> > >> > > >> // static org.junit.Assert.assertEquals
>>> > > > > >> > >> > > >> private static final byte[] TEST_DATA = new byte[35 * 1024];
>>> > > > > >> > >> > > >> private static final int testsize = 1000;
>>> > > > > >> > >> > > >>
>>> > > > > >> > >> > > >> ...... (start 1 bookie, see below)
>>> > > > > >> > >> > > >>             ClientConfiguration clientConfiguration = new ClientConfiguration();
>>> > > > > >> > >> > > >>             clientConfiguration.setZkServers(env.getAddress());
>>> > > > > >> > >> > > >>             try (BookKeeper bk = new BookKeeper(clientConfiguration);
>>> > > > > >> > >> > > >>                 LedgerHandle lh = bk.createLedger(1, 1, 1,
>>> > > > > >> > >> > > >>                         BookKeeper.DigestType.CRC32, new byte[0])) {
>>> > > > > >> > >> > > >>                 LongAdder totalTime = new LongAdder();
>>> > > > > >> > >> > > >>                 long _start = System.currentTimeMillis();
>>> > > > > >> > >> > > >>                 Collection<CompletableFuture> batch = new ConcurrentLinkedQueue<>();
>>> > > > > >> > >> > > >>                 for (int i = 0; i < testsize; i++) {
>>> > > > > >> > >> > > >>                     CompletableFuture cf = new CompletableFuture();
>>> > > > > >> > >> > > >>                     batch.add(cf);
>>> > > > > >> > >> > > >>                     lh.asyncAddEntry(TEST_DATA, new AsyncCallback.AddCallback() {
>>> > > > > >> > >> > > >>
>>> > > > > >> > >> > > >>                         // taken when the add is issued; one callback object per entry
>>> > > > > >> > >> > > >>                         long start = System.currentTimeMillis();
>>> > > > > >> > >> > > >>
>>> > > > > >> > >> > > >>                         @Override
>>> > > > > >> > >> > > >>                         public void addComplete(int rc, LedgerHandle lh,
>>> > > > > >> > >> > > >>                                 long entryId, Object ctx) {
>>> > > > > >> > >> > > >>                             long now = System.currentTimeMillis();
>>> > > > > >> > >> > > >>                             CompletableFuture _cf = (CompletableFuture) ctx;
>>> > > > > >> > >> > > >>                             if (rc == BKException.Code.OK) {
>>> > > > > >> > >> > > >>                                 _cf.complete("");
>>> > > > > >> > >> > > >>                             } else {
>>> > > > > >> > >> > > >>                                 _cf.completeExceptionally(BKException.create(rc));
>>> > > > > >> > >> > > >>                             }
>>> > > > > >> > >> > > >>                             totalTime.add(now - start); // per-entry latency
>>> > > > > >> > >> > > >>                         }
>>> > > > > >> > >> > > >>                     }, cf);
>>> > > > > >> > >> > > >> //                    Thread.sleep(1);      // this is the trigger!!!
>>> > > > > >> > >> > > >>                 }
>>> > > > > >> > >> > > >>                 assertEquals(testsize, batch.size());
>>> > > > > >> > >> > > >>                 for (CompletableFuture f : batch) {
>>> > > > > >> > >> > > >>                     f.get();
>>> > > > > >> > >> > > >>                 }
>>> > > > > >> > >> > > >>                 long _stop = System.currentTimeMillis();
>>> > > > > >> > >> > > >>                 long delta = _stop - _start;
>>> > > > > >> > >> > > >>                 System.out.println("Total time: " + delta + " ms");
>>> > > > > >> > >> > > >>                 System.out.println("Total real time: " + totalTime.sum()
>>> > > > > >> > >> > > >>                         + " ms -> " + (totalTime.sum() / testsize) + " ms per entry");
>>> > > > > >> > >> > > >>             }
>>> > > > > >> > >> > > >>
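>>> > > > > >> > >> > > >>
>>> > > > > >> > >> > > >> As a side note on measurement: a single average can hide the shape of the distribution. A minimal sketch (my addition, assuming addComplete also stores each "now - start" into a shared queue) to print percentiles instead:
>>> > > > > >> > >> > > >>
>>> > > > > >> > >> > > >> import java.util.concurrent.ConcurrentLinkedQueue;
>>> > > > > >> > >> > > >>
>>> > > > > >> > >> > > >> // Sketch only: 'latencies' is assumed to be filled by addComplete()
>>> > > > > >> > >> > > >> // with (now - start) for every entry, and to be non-empty here.
>>> > > > > >> > >> > > >> static void printPercentiles(ConcurrentLinkedQueue<Long> latencies) {
>>> > > > > >> > >> > > >>     long[] sorted = latencies.stream().mapToLong(Long::longValue).sorted().toArray();
>>> > > > > >> > >> > > >>     System.out.println("p50: " + sorted[sorted.length / 2] + " ms, p99: "
>>> > > > > >> > >> > > >>             + sorted[(int) (sorted.length * 0.99)] + " ms");
>>> > > > > >> > >> > > >> }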
>>> > > > > >> > >> > > >> Bookie config:
>>> > > > > >> > >> > > >>         ServerConfiguration conf = new ServerConfiguration();
>>> > > > > >> > >> > > >>         conf.setBookiePort(5621);
>>> > > > > >> > >> > > >>         conf.setUseHostNameAsBookieID(true);
>>> > > > > >> > >> > > >>
>>> > > > > >> > >> > > >>         Path targetDir = path.resolve("bookie_data");
>>> > > > > >> > >> > > >>         conf.setZkServers("localhost:1282");
>>> > > > > >> > >> > > >>         conf.setLedgerDirNames(new String[]{targetDir.toAbsolutePath().toString()});
>>> > > > > >> > >> > > >>         conf.setJournalDirName(targetDir.toAbsolutePath().toString());
>>> > > > > >> > >> > > >>         conf.setFlushInterval(1000);
>>> > > > > >> > >> > > >>         conf.setJournalFlushWhenQueueEmpty(true);
>>> > > > > >> > >> > > >>         conf.setProperty("journalMaxGroupWaitMSec", 0);
>>> > > > > >> > >> > > >>         conf.setProperty("journalBufferedWritesThreshold", 1024);
>>> > > > > >> > >> > > >>         conf.setAutoRecoveryDaemonEnabled(false);
>>> > > > > >> > >> > > >>         conf.setEnableLocalTransport(true);
>>> > > > > >> > >> > > >>         conf.setAllowLoopback(true);
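>>> > > > > >> > >> > > >>
>>> > > > > >> > >> > > >> Note on the settings above, and on the "grouping" question further down: this configuration disables group commit (journalMaxGroupWaitMSec = 0) and flushes whenever the queue is empty. A hedged sketch of the opposite tuning, maybe worth a comparative run (the values are guesses, and I am assuming journalBufferedWritesThreshold counts bytes):
>>> > > > > >> > >> > > >>
>>> > > > > >> > >> > > >>         ServerConfiguration grouped = new ServerConfiguration();
>>> > > > > >> > >> > > >>         // Let the journal batch several adds into one fsync instead of
>>> > > > > >> > >> > > >>         // flushing as soon as its queue drains: bounded wait, fewer syncs.
>>> > > > > >> > >> > > >>         grouped.setJournalFlushWhenQueueEmpty(false);
>>> > > > > >> > >> > > >>         grouped.setProperty("journalMaxGroupWaitMSec", 2);                 // guessed value
>>> > > > > >> > >> > > >>         grouped.setProperty("journalBufferedWritesThreshold", 512 * 1024); // assumed bytes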
>>> > > > > >> > >> > > >>
>>> > > > > >> > >> > > >> The test starts one ZK server + 1 Bookie + the testcase in a JUnit test.
>>> > > > > >> > >> > > >>
>>> > > > > >> > >> > > >>
>>> > > > > >> > >> > > >> Results:
>>> > > > > >> > >> > > >> A - BK-4.4.0:
>>> > > > > >> > >> > > >> Total time: 209 ms
>>> > > > > >> > >> > > >> Total real time: 194337 ms -> 194 ms per entry
>>> > > > > >> > >> > > >>
>>> > > > > >> > >> > > >> B - BK-4.5.0-SNAPSHOT:
>>> > > > > >> > >> > > >> Total time: 269 ms
>>> > > > > >> > >> > > >> Total real time: 239918 ms -> 239 ms per entry
>>> > > > > >> > >> > > >>
>>> > > > > >> > >> > > >> C - BK-4.4.0 with sleep(1):
>>> > > > > >> > >> > > >> Total time: 1113 ms (1000 ms sleep time)
>>> > > > > >> > >> > > >> Total real time: 4238 ms -> 4 ms per entry
>>> > > > > >> > >> > > >>
>>> > > > > >> > >> > > >> D - BK-4.5.0-SNAPSHOT with sleep(1):
>>> > > > > >> > >> > > >> Total time: 1121 ms (1000 ms sleep time)
>>> > > > > >> > >> > > >> Total real time: 8018 ms -> 8 ms per entry
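>>> > > > > >> > >> > > >>
>>> > > > > >> > >> > > >> For scale, a back-of-the-envelope check: each run writes 1000 x 35 KB (about 34 MB), so case A's 209 ms wall clock is roughly 165 MB/s of raw throughput, even while the average per-entry latency is 194 ms.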
>>> > > > > >> > >> > > >>
>>> > > > > >> > >> > > >> Problem 1 (unexpected performance degradation):
>>> > > > > >> > >> > > >> Times per entry (latency) are incredibly slow in cases A and B.
>>> > > > > >> > >> > > >> If I add a sleep(1) between one call of asyncAddEntry and the next, "latency" drops to around 4 ms per entry.
>>> > > > > >> > >> > > >>
>>> > > > > >> > >> > > >> Problem 2: worse performance on 4.5.0
>>> > > > > >> > >> > > >> Compare A vs B and C vs D; it is self-explanatory.
>>> > > > > >> > >> > > >>
>>> > > > > >> > >> > > >> I am running the test on my laptop, with Linux 64-bit (Fedora), 12 GB RAM, no swap, on an SSD disk. The results are similar on other computers.
>>> > > > > >> > >> > > >>
>>> > > > > >> > >> > > >> It seems that if I issue too many addEntry calls the system slows down.
>>> > > > > >> > >> > > >>
>>> > > > > >> > >> > > >> Please note this fact:
>>> > > > > >> > >> > > >> the numbers for cases A and B (without sleep) mean that all the adds got completed almost together.
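>>> > > > > >> > >> > > >> (As a quick sanity check: in case A the average callback latency, 194 ms, is almost the entire 209 ms wall clock, which is only possible if nearly every add stayed outstanding until the end of the run and the completions arrived in one burst.)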
>>> > > > > >> > >> > > >>
>>> > > > > >> > >> > > >> For the 4.5 vs 4.4 case:
>>> > > > > >> > >> > > >> I tried to disable all of the threadpool enhancements (different read/write pools)... it makes no difference.
>>> > > > > >> > >> > > >>
>>> > > > > >> > >> > > >> Questions:
>>> > > > > >> > >> > > >>
>>> > > > > >> > >> > > >> Is it the "grouping" logic of the journal?
>>> > > > > >> > >> > > >>
>>> > > > > >> > >> > > >> Is there a way to make a burst of 1000 async writes on the same ledger perform at <10 ms latency? (In my real case I have bursts of concurrent writes from different threads; one approach is sketched below.)
>>> > > > > >> > >> > > >>
>>> > > > > >> > >> > > >> Why is 4.5.0 slower anyway?
>>> > > > > >> > >> > > >>
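>>> > > > > >> > >> > > >> For the second question, one thing I would try (a sketch of my own, not a BookKeeper feature; MAX_IN_FLIGHT = 16 is a guess to tune): bound the number of outstanding adds with a Semaphore, so that a burst cannot queue 1000 entries behind a single journal fsync.
>>> > > > > >> > >> > > >>
>>> > > > > >> > >> > > >> import java.util.concurrent.Semaphore;
>>> > > > > >> > >> > > >>
>>> > > > > >> > >> > > >> private static final int MAX_IN_FLIGHT = 16; // guessed cap, to be tuned
>>> > > > > >> > >> > > >> private final Semaphore inFlight = new Semaphore(MAX_IN_FLIGHT);
>>> > > > > >> > >> > > >>
>>> > > > > >> > >> > > >> void addWithBackpressure(LedgerHandle lh, byte[] data) throws InterruptedException {
>>> > > > > >> > >> > > >>     inFlight.acquire(); // blocks while MAX_IN_FLIGHT adds are pending
>>> > > > > >> > >> > > >>     lh.asyncAddEntry(data, (rc, handle, entryId, ctx) -> inFlight.release(), null);
>>> > > > > >> > >> > > >> }
>>> > > > > >> > >> > > >>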
>>> > > > > >> > >> > > >> Thanks
>>> > > > > >> > >> > > >>
>>> > > > > >> > >> > > >> -- Enrico
>>> > > > > >> > >> > > >>
>>> > > > > >> > >>
>>> > > > > >> > >> --
>>> > > > > >> > >> Jvrao
>>> > > > > >> > >> ---
>>> > > > > >> > >> First they ignore you, then they laugh at you, then they fight you, then you win. - Mahatma Gandhi
>>> > > > > >> > > -- Enrico Olivelli


-- Enrico Olivelli
