On Mon, Jul 24, 2017 at 3:06 AM, Enrico Olivelli <[email protected]> wrote:
> 2017-07-21 20:37 GMT+02:00 Enrico Olivelli <[email protected]>: > > > > > > > Il ven 21 lug 2017, 20:32 Sijie Guo <[email protected]> ha scritto: > > > >> As the discussion in a separate thread, it might be making sense to > check > >> what is the difference between using pooled allocator and unpooled > >> allocator using v3 protocol. Also considering comparing using heap > buffer > >> and direct buffer as well. > >> > >> I am suspecting this might contribute latency. > >> > > > > Yep, I am looking in this direction too. > > I see that many frequent writes lead to an huge use of non heap memory, > > even bounding the JVM with MaxDirectMemory with max 1GB all, the 12GB of > my > > laptop blow away during the run of my benchmark. > > I suspect it is something in direct memory or something in SO caches. > > I am not very skilled in SO linux memory diagnostics > > > > > I wrote a new "write intensive" benchmark, and the only thing I have > noticed is that Linux is using as much RAM as possible for disk caches, > this is the expected behavior on Linux. > Yes, and this is good behavior. Why keep something unused? > This is not memory allocated to the process itself. > There is no difference from 4.4 and 4.5 from this aspect. > > I have tried the journalRemoveFromPageCachebut it brings no improvement. > > I did some tests as suggested by Flavio, separating the client and the > bookie (even on different machines). I can say there is no "leak" nor on > client side neither on bookie side. > > > finally during my benchmarks I noticed that the real "performance drop" > happens after this lines of log > > lug 24, 2017 12:00:56 PM org.apache.bookkeeper.bookie.EntryLogger > flushRotatedLogs > INFO: Synced entry logger 0 to disk > > I am now investigating why after the first "flushRotatedLogs" bookie is > slowing down > > Thanks for the update. Eager to learn what is the culprit. JV > > Enrico > > > > > > > > Enrico > > > > > > > >> > >> > >> > >> - Sijie > >> > >> On Thu, Jul 20, 2017 at 4:49 AM, Enrico Olivelli <[email protected]> > >> wrote: > >> > >> > Kishore, do you have news? > >> > > >> > Il ven 14 lug 2017, 09:05 Enrico Olivelli <[email protected]> ha > >> > scritto: > >> > > >> > > At the meeting we told the Kishore will perform some benchmarks on > his > >> > > side. > >> > > He will take a look at my code, and we are going to share the > results. > >> > > Maybe it will be possible to share the results of benchmarks done > from > >> > > Kishore at Salesforce too. > >> > > > >> > > The primary goal is to understand the differences between 4.4 and > 4.5, > >> > for > >> > > instance if we there is a need to change JVM/BK configuration in > >> order to > >> > > make 4.5 perform as 4.4. > >> > > > >> > > @Sijie I hope I have answered your questions. > >> > > > >> > > > >> > > -- Enrico > >> > > > >> > > > >> > > 2017-07-13 9:29 GMT+02:00 Enrico Olivelli <[email protected]>: > >> > > > >> > >> > >> > >> > >> > >> 2017-07-13 4:11 GMT+02:00 Sijie Guo <[email protected]>: > >> > >> > >> > >>> On Wed, Jul 12, 2017 at 10:35 PM, Enrico Olivelli < > >> [email protected] > >> > > > >> > >>> wrote: > >> > >>> > >> > >>> > Sijie, JV, just a recap my point of view: > >> > >>> > - considering latency = "time for asynchAddEntry to complete" > >> > >>> > - there is a some difference from 4.4 and 4.5 in the usage of > >> memory, > >> > >>> but > >> > >>> > no so clear > >> > >>> > - the type of GC (parallel vs G1) does not impact very much but > >> with > >> > >>> G1 you > >> > >>> > achieve best latency > >> > >>> > > >> > >>> > >> > >>> Did you try CMS for v3 protocol? > >> > >>> > >> > >> > >> > >> BK 4.5 + v3 procotol + CMS + throttle = 0 > >> > >> > >> > >> #0 Total wall clock time: 309.0 ms, total callbacks time: 249753 > ms, > >> > >> entry size 0,034 MB -> 249,75 ms per entry (latency),0,3 ms per > entry > >> > >> (throughput) 110,6 MB/s throughput > >> > >> #1 Total wall clock time: 192.0 ms, total callbacks time: 156268 > ms, > >> > >> entry size 0,034 MB -> 156,27 ms per entry (latency),0,2 ms per > entry > >> > >> (throughput) 178,0 MB/s throughput > >> > >> #2 Total wall clock time: 129.0 ms, total callbacks time: 95279 ms, > >> > entry > >> > >> size 0,034 MB -> 95,28 ms per entry (latency),0,1 ms per entry > >> > (throughput) > >> > >> 265,0 MB/s throughput > >> > >> #3 Total wall clock time: 125.0 ms, total callbacks time: 94661 ms, > >> > entry > >> > >> size 0,034 MB -> 94,66 ms per entry (latency),0,1 ms per entry > >> > (throughput) > >> > >> 273,4 MB/s throughput > >> > >> #4 Total wall clock time: 112.0 ms, total callbacks time: 83015 ms, > >> > entry > >> > >> size 0,034 MB -> 83,02 ms per entry (latency),0,1 ms per entry > >> > (throughput) > >> > >> 305,2 MB/s throughput > >> > >> #5 Total wall clock time: 111.0 ms, total callbacks time: 82839 ms, > >> > entry > >> > >> size 0,034 MB -> 82,84 ms per entry (latency),0,1 ms per entry > >> > (throughput) > >> > >> 307,9 MB/s throughput > >> > >> #6 Total wall clock time: 114.0 ms, total callbacks time: 82562 ms, > >> > entry > >> > >> size 0,034 MB -> 82,56 ms per entry (latency),0,1 ms per entry > >> > (throughput) > >> > >> 299,8 MB/s throughput > >> > >> #7 Total wall clock time: 104.0 ms, total callbacks time: 74356 ms, > >> > entry > >> > >> size 0,034 MB -> 74,36 ms per entry (latency),0,1 ms per entry > >> > (throughput) > >> > >> 328,7 MB/s throughput > >> > >> #8 Total wall clock time: 110.0 ms, total callbacks time: 83488 ms, > >> > entry > >> > >> size 0,034 MB -> 83,49 ms per entry (latency),0,1 ms per entry > >> > (throughput) > >> > >> 310,7 MB/s throughput > >> > >> #9 Total wall clock time: 102.0 ms, total callbacks time: 68787 ms, > >> > entry > >> > >> size 0,034 MB -> 68,79 ms per entry (latency),0,1 ms per entry > >> > (throughput) > >> > >> 335,1 MB/s throughput > >> > >> > >> > >> gc.log > >> > >> Java HotSpot(TM) 64-Bit Server VM (25.121-b13) for linux-amd64 JRE > >> > >> (1.8.0_121-b13), built on Dec 12 2016 16:36:53 by "java_re" with > gcc > >> > 4.3.0 > >> > >> 20080428 (Red Hat 4.3.0-8) > >> > >> Memory: 4k page, physical 32870636k(3276148k free), swap > >> > >> 15622140k(15460604k free) > >> > >> CommandLine flags: -XX:+CMSIncrementalMode > >> > -XX:InitialHeapSize=2147483648 <%28214%29%20748-3648> > >> > >> <(214)%20748-3648> -XX:MaxHeapSize=2147483648 > <%28214%29%20748-3648> > >> <(214)%20748-3648> > >> > >> -XX:MaxNewSize=697933824 -XX:MaxTenuringThreshold=6 > >> > -XX:NewSize=697933824 > >> > >> -XX:OldPLABSize=16 -XX:OldSize=1395867648 -XX:+PrintGC > >> > >> -XX:+PrintGCTimeStamps -XX:+UseCompressedClassPointers > >> > >> -XX:+UseCompressedOops -XX:+UseConcMarkSweepGC -XX:+UseParNewGC > >> > >> 0,630: [GC (Allocation Failure) 545344K->8508K(2029056K), > 0,0065802 > >> > secs] > >> > >> 1,196: [GC (Allocation Failure) 553852K->117940K(2029056K), > >> 0,0385576 > >> > >> secs] > >> > >> 1,235: [GC (CMS Initial Mark) 119486K(2029056K), 0,0022661 secs] > >> > >> 1,544: [GC (Allocation Failure) 663284K->207833K(2029056K), > >> 0,0330904 > >> > >> secs] > >> > >> 1,731: [GC (CMS Final Remark) 560308K(2029056K), 0,0105624 secs] > >> > >> 1,837: [GC (Allocation Failure) 753177K->214147K(2029056K), > >> 0,0105574 > >> > >> secs] > >> > >> 2,115: [GC (Allocation Failure) 759491K->241263K(2029056K), > >> 0,0166134 > >> > >> secs] > >> > >> > >> > >> > >> > >> BK 4.5 + v2 protocol + CMS + throttle = 0 > >> > >> > >> > >> #0 Total wall clock time: 198.0 ms, total callbacks time: 169954 > ms, > >> > >> entry size 0,034 MB -> 169,95 ms per entry (latency),0,2 ms per > entry > >> > >> (throughput) 172,6 MB/s throughput > >> > >> #1 Total wall clock time: 106.0 ms, total callbacks time: 70885 ms, > >> > entry > >> > >> size 0,034 MB -> 70,89 ms per entry (latency),0,1 ms per entry > >> > (throughput) > >> > >> 322,4 MB/s throughput > >> > >> #2 Total wall clock time: 109.0 ms, total callbacks time: 71848 ms, > >> > entry > >> > >> size 0,034 MB -> 71,85 ms per entry (latency),0,1 ms per entry > >> > (throughput) > >> > >> 313,6 MB/s throughput > >> > >> #3 Total wall clock time: 97.0 ms, total callbacks time: 73583 ms, > >> entry > >> > >> size 0,034 MB -> 73,58 ms per entry (latency),0,1 ms per entry > >> > (throughput) > >> > >> 352,4 MB/s throughput > >> > >> #4 Total wall clock time: 91.0 ms, total callbacks time: 61293 ms, > >> entry > >> > >> size 0,034 MB -> 61,29 ms per entry (latency),0,1 ms per entry > >> > (throughput) > >> > >> 375,6 MB/s throughput > >> > >> #5 Total wall clock time: 98.0 ms, total callbacks time: 73006 ms, > >> entry > >> > >> size 0,034 MB -> 73,01 ms per entry (latency),0,1 ms per entry > >> > (throughput) > >> > >> 348,8 MB/s throughput > >> > >> #6 Total wall clock time: 104.0 ms, total callbacks time: 75696 ms, > >> > entry > >> > >> size 0,034 MB -> 75,70 ms per entry (latency),0,1 ms per entry > >> > (throughput) > >> > >> 328,7 MB/s throughput > >> > >> #7 Total wall clock time: 103.0 ms, total callbacks time: 79260 ms, > >> > entry > >> > >> size 0,034 MB -> 79,26 ms per entry (latency),0,1 ms per entry > >> > (throughput) > >> > >> 331,8 MB/s throughput > >> > >> #8 Total wall clock time: 92.0 ms, total callbacks time: 62232 ms, > >> entry > >> > >> size 0,034 MB -> 62,23 ms per entry (latency),0,1 ms per entry > >> > (throughput) > >> > >> 371,5 MB/s throughput > >> > >> #9 Total wall clock time: 98.0 ms, total callbacks time: 65359 ms, > >> entry > >> > >> size 0,034 MB -> 65,36 ms per entry (latency),0,1 ms per entry > >> > (throughput) > >> > >> 348,8 MB/s throughput > >> > >> > >> > >> > >> > >> gc.log > >> > >> Java HotSpot(TM) 64-Bit Server VM (25.121-b13) for linux-amd64 JRE > >> > >> (1.8.0_121-b13), built on Dec 12 2016 16:36:53 by "java_re" with > gcc > >> > 4.3.0 > >> > >> 20080428 (Red Hat 4.3.0-8) > >> > >> Memory: 4k page, physical 32870636k(3154408k free), swap > >> > >> 15622140k(15460604k free) > >> > >> CommandLine flags: -XX:+CMSIncrementalMode > >> > -XX:InitialHeapSize=2147483648 <%28214%29%20748-3648> > >> > >> <(214)%20748-3648> -XX:MaxHeapSize=2147483648 > <%28214%29%20748-3648> > >> <(214)%20748-3648> > >> > >> -XX:MaxNewSize=697933824 -XX:MaxTenuringThreshold=6 > >> > -XX:NewSize=697933824 > >> > >> -XX:OldPLABSize=16 -XX:OldSize=1395867648 -XX:+PrintGC > >> > >> -XX:+PrintGCTimeStamps -XX:+UseCompressedClassPointers > >> > >> -XX:+UseCompressedOops -XX:+UseConcMarkSweepGC -XX:+UseParNewGC > >> > >> 0,554: [GC (Allocation Failure) 545344K->8624K(2029056K), > 0,0060763 > >> > secs] > >> > >> 1,478: [GC (Allocation Failure) 552028K->79872K(2029056K), > 0,0252399 > >> > >> secs] > >> > >> 1,504: [GC (CMS Initial Mark) 84227K(2029056K), 0,0017813 secs] > >> > >> > >> > >> > >> > >> BK 4.4 + v3protocol + CMS + throttle = 0 > >> > >> > >> > >> #0 Total wall clock time: 214.0 ms, total callbacks time: 207369 > ms, > >> > >> entry size 0,034 MB -> 207,37 ms per entry (latency),0,2 ms per > entry > >> > >> (throughput) 159,7 MB/s throughput > >> > >> #1 Total wall clock time: 153.0 ms, total callbacks time: 140661 > ms, > >> > >> entry size 0,034 MB -> 140,66 ms per entry (latency),0,2 ms per > entry > >> > >> (throughput) 223,4 MB/s throughput > >> > >> #2 Total wall clock time: 159.0 ms, total callbacks time: 145131 > ms, > >> > >> entry size 0,034 MB -> 145,13 ms per entry (latency),0,2 ms per > entry > >> > >> (throughput) 215,0 MB/s throughput > >> > >> #3 Total wall clock time: 109.0 ms, total callbacks time: 105018 > ms, > >> > >> entry size 0,034 MB -> 105,02 ms per entry (latency),0,1 ms per > entry > >> > >> (throughput) 313,6 MB/s throughput > >> > >> #4 Total wall clock time: 114.0 ms, total callbacks time: 110111 > ms, > >> > >> entry size 0,034 MB -> 110,11 ms per entry (latency),0,1 ms per > entry > >> > >> (throughput) 299,8 MB/s throughput > >> > >> #5 Total wall clock time: 118.0 ms, total callbacks time: 111534 > ms, > >> > >> entry size 0,034 MB -> 111,53 ms per entry (latency),0,1 ms per > entry > >> > >> (throughput) 289,7 MB/s throughput > >> > >> #6 Total wall clock time: 98.0 ms, total callbacks time: 87425 ms, > >> entry > >> > >> size 0,034 MB -> 87,43 ms per entry (latency),0,1 ms per entry > >> > (throughput) > >> > >> 348,8 MB/s throughput > >> > >> #7 Total wall clock time: 135.0 ms, total callbacks time: 129583 > ms, > >> > >> entry size 0,034 MB -> 129,58 ms per entry (latency),0,1 ms per > entry > >> > >> (throughput) 253,2 MB/s throughput > >> > >> #8 Total wall clock time: 99.0 ms, total callbacks time: 84986 ms, > >> entry > >> > >> size 0,034 MB -> 84,99 ms per entry (latency),0,1 ms per entry > >> > (throughput) > >> > >> 345,2 MB/s throughput > >> > >> #9 Total wall clock time: 125.0 ms, total callbacks time: 117492 > ms, > >> > >> entry size 0,034 MB -> 117,49 ms per entry (latency),0,1 ms per > entry > >> > >> (throughput) 273,4 MB/s throughput > >> > >> > >> > >> gc.log > >> > >> Java HotSpot(TM) 64-Bit Server VM (25.121-b13) for linux-amd64 JRE > >> > >> (1.8.0_121-b13), built on Dec 12 2016 16:36:53 by "java_re" with > gcc > >> > 4.3.0 > >> > >> 20080428 (Red Hat 4.3.0-8) > >> > >> Memory: 4k page, physical 32870636k(3212484k free), swap > >> > >> 15622140k(15460604k free) > >> > >> CommandLine flags: -XX:+CMSIncrementalMode > >> > -XX:InitialHeapSize=2147483648 <%28214%29%20748-3648> > >> > >> <(214)%20748-3648> -XX:MaxHeapSize=2147483648 > <%28214%29%20748-3648> > >> <(214)%20748-3648> > >> > >> -XX:MaxNewSize=697933824 -XX:MaxTenuringThreshold=6 > >> > -XX:NewSize=697933824 > >> > >> -XX:OldPLABSize=16 -XX:OldSize=1395867648 -XX:+PrintGC > >> > >> -XX:+PrintGCTimeStamps -XX:+UseCompressedClassPointers > >> > >> -XX:+UseCompressedOops -XX:+UseConcMarkSweepGC -XX:+UseParNewGC > >> > >> 0,572: [GC (Allocation Failure) 545344K->8016K(2029056K), > 0,0057737 > >> > secs] > >> > >> 0,882: [GC (Allocation Failure) 553360K->88089K(2029056K), > 0,0205542 > >> > >> secs] > >> > >> 0,903: [GC (CMS Initial Mark) 93974K(2029056K), 0,0010542 secs] > >> > >> 1,100: [GC (Allocation Failure) 633433K->94935K(2029056K), > 0,0359948 > >> > >> secs] > >> > >> 1,221: [GC (CMS Final Remark) 384627K(2029056K), 0,0079453 secs] > >> > >> 1,313: [GC (Allocation Failure) 640279K->98309K(2029056K), > 0,0157774 > >> > >> secs] > >> > >> 1,481: [GC (Allocation Failure) 643653K->154062K(2029056K), > >> 0,0179418 > >> > >> secs] > >> > >> 1,677: [GC (Allocation Failure) 699406K->210685K(2029056K), > >> 0,0312668 > >> > >> secs] > >> > >> 1,709: [GC (CMS Initial Mark) 211186K(2029056K), 0,0006615 secs] > >> > >> 1,894: [GC (Allocation Failure) 756029K->263420K(2029056K), > >> 0,0168112 > >> > >> secs] > >> > >> > >> > >> BK 4.5 + throttle = 5000 + CMS + v3 protocol > >> > >> > >> > >> #0 Total wall clock time: 248.0 ms, total callbacks time: 102307 > ms, > >> > >> entry size 0,034 MB -> 102,31 ms per entry (latency),0,2 ms per > entry > >> > >> (throughput) 137,8 MB/s throughput > >> > >> #1 Total wall clock time: 205.0 ms, total callbacks time: 12282 ms, > >> > entry > >> > >> size 0,034 MB -> 12,28 ms per entry (latency),0,2 ms per entry > >> > (throughput) > >> > >> 166,7 MB/s throughput > >> > >> #2 Total wall clock time: 204.0 ms, total callbacks time: 6920 ms, > >> entry > >> > >> size 0,034 MB -> 6,92 ms per entry (latency),0,2 ms per entry > >> > (throughput) > >> > >> 167,5 MB/s throughput > >> > >> #3 Total wall clock time: 221.0 ms, total callbacks time: 13589 ms, > >> > entry > >> > >> size 0,034 MB -> 13,59 ms per entry (latency),0,2 ms per entry > >> > (throughput) > >> > >> 154,7 MB/s throughput > >> > >> #4 Total wall clock time: 204.0 ms, total callbacks time: 6023 ms, > >> entry > >> > >> size 0,034 MB -> 6,02 ms per entry (latency),0,2 ms per entry > >> > (throughput) > >> > >> 167,5 MB/s throughput > >> > >> #5 Total wall clock time: 205.0 ms, total callbacks time: 6350 ms, > >> entry > >> > >> size 0,034 MB -> 6,35 ms per entry (latency),0,2 ms per entry > >> > (throughput) > >> > >> 166,7 MB/s throughput > >> > >> #6 Total wall clock time: 205.0 ms, total callbacks time: 7849 ms, > >> entry > >> > >> size 0,034 MB -> 7,85 ms per entry (latency),0,2 ms per entry > >> > (throughput) > >> > >> 166,7 MB/s throughput > >> > >> #7 Total wall clock time: 204.0 ms, total callbacks time: 6443 ms, > >> entry > >> > >> size 0,034 MB -> 6,44 ms per entry (latency),0,2 ms per entry > >> > (throughput) > >> > >> 167,5 MB/s throughput > >> > >> #8 Total wall clock time: 206.0 ms, total callbacks time: 7009 ms, > >> entry > >> > >> size 0,034 MB -> 7,01 ms per entry (latency),0,2 ms per entry > >> > (throughput) > >> > >> 165,9 MB/s throughput > >> > >> #9 Total wall clock time: 205.0 ms, total callbacks time: 6150 ms, > >> entry > >> > >> size 0,034 MB -> 6,15 ms per entry (latency),0,2 ms per entry > >> > (throughput) > >> > >> 166,7 MB/s throughput > >> > >> > >> > >> Java HotSpot(TM) 64-Bit Server VM (25.121-b13) for linux-amd64 JRE > >> > >> (1.8.0_121-b13), built on Dec 12 2016 16:36:53 by "java_re" with > gcc > >> > 4.3.0 > >> > >> 20080428 (Red Hat 4.3.0-8) > >> > >> Memory: 4k page, physical 32870636k(3247600k free), swap > >> > >> 15622140k(15460604k free) > >> > >> CommandLine flags: -XX:+CMSIncrementalMode > >> > -XX:InitialHeapSize=2147483648 <%28214%29%20748-3648> > >> > >> <(214)%20748-3648> -XX:MaxHeapSize=2147483648 > <%28214%29%20748-3648> > >> <(214)%20748-3648> > >> > >> -XX:MaxNewSize=697933824 -XX:MaxTenuringThreshold=6 > >> > -XX:NewSize=697933824 > >> > >> -XX:OldPLABSize=16 -XX:OldSize=1395867648 -XX:+PrintGC > >> > >> -XX:+PrintGCTimeStamps -XX:+UseCompressedClassPointers > >> > >> -XX:+UseCompressedOops -XX:+UseConcMarkSweepGC -XX:+UseParNewGC > >> > >> 0,568: [GC (Allocation Failure) 545344K->8497K(2029056K), > 0,0060302 > >> > secs] > >> > >> 1,059: [GC (Allocation Failure) 553841K->112138K(2029056K), > >> 0,0206230 > >> > >> secs] > >> > >> 1,080: [GC (CMS Initial Mark) 117602K(2029056K), 0,0013282 secs] > >> > >> 1,561: [GC (Allocation Failure) 654247K->189231K(2029056K), > >> 0,0379235 > >> > >> secs] > >> > >> 2,079: [GC (Allocation Failure) 731215K->185468K(2029056K), > >> 0,0107924 > >> > >> secs] > >> > >> 2,320: [GC (CMS Final Remark) 431925K(2029056K), 0,0076070 secs] > >> > >> 2,635: [GC (Allocation Failure) 730812K->217376K(2029056K), > >> 0,0136162 > >> > >> secs] > >> > >> > >> > >> > >> > >> BK 4.4 + throttle = 5000 + CMS + v3 protocol > >> > >> > >> > >> #0 Total wall clock time: 208.0 ms, total callbacks time: 55274 ms, > >> > entry > >> > >> size 0,034 MB -> 55,27 ms per entry (latency),0,2 ms per entry > >> > (throughput) > >> > >> 164,3 MB/s throughput > >> > >> #1 Total wall clock time: 207.0 ms, total callbacks time: 11848 ms, > >> > entry > >> > >> size 0,034 MB -> 11,85 ms per entry (latency),0,2 ms per entry > >> > (throughput) > >> > >> 165,1 MB/s throughput > >> > >> #2 Total wall clock time: 208.0 ms, total callbacks time: 9442 ms, > >> entry > >> > >> size 0,034 MB -> 9,44 ms per entry (latency),0,2 ms per entry > >> > (throughput) > >> > >> 164,3 MB/s throughput > >> > >> #3 Total wall clock time: 204.0 ms, total callbacks time: 6739 ms, > >> entry > >> > >> size 0,034 MB -> 6,74 ms per entry (latency),0,2 ms per entry > >> > (throughput) > >> > >> 167,5 MB/s throughput > >> > >> #4 Total wall clock time: 205.0 ms, total callbacks time: 7546 ms, > >> entry > >> > >> size 0,034 MB -> 7,55 ms per entry (latency),0,2 ms per entry > >> > (throughput) > >> > >> 166,7 MB/s throughput > >> > >> #5 Total wall clock time: 204.0 ms, total callbacks time: 6594 ms, > >> entry > >> > >> size 0,034 MB -> 6,59 ms per entry (latency),0,2 ms per entry > >> > (throughput) > >> > >> 167,5 MB/s throughput > >> > >> #6 Total wall clock time: 205.0 ms, total callbacks time: 5885 ms, > >> entry > >> > >> size 0,034 MB -> 5,89 ms per entry (latency),0,2 ms per entry > >> > (throughput) > >> > >> 166,7 MB/s throughput > >> > >> #7 Total wall clock time: 204.0 ms, total callbacks time: 7722 ms, > >> entry > >> > >> size 0,034 MB -> 7,72 ms per entry (latency),0,2 ms per entry > >> > (throughput) > >> > >> 167,5 MB/s throughput > >> > >> #8 Total wall clock time: 224.0 ms, total callbacks time: 5968 ms, > >> entry > >> > >> size 0,034 MB -> 5,97 ms per entry (latency),0,2 ms per entry > >> > (throughput) > >> > >> 152,6 MB/s throughput > >> > >> #9 Total wall clock time: 209.0 ms, total callbacks time: 6731 ms, > >> entry > >> > >> size 0,034 MB -> 6,73 ms per entry (latency),0,2 ms per entry > >> > (throughput) > >> > >> 163,5 MB/s throughput > >> > >> > >> > >> gc.log > >> > >> Java HotSpot(TM) 64-Bit Server VM (25.121-b13) for linux-amd64 JRE > >> > >> (1.8.0_121-b13), built on Dec 12 2016 16:36:53 by "java_re" with > gcc > >> > 4.3.0 > >> > >> 20080428 (Red Hat 4.3.0-8) > >> > >> Memory: 4k page, physical 32870636k(3291888k free), swap > >> > >> 15622140k(15460604k free) > >> > >> CommandLine flags: -XX:+CMSIncrementalMode > >> > -XX:InitialHeapSize=2147483648 <%28214%29%20748-3648> > >> > >> <(214)%20748-3648> -XX:MaxHeapSize=2147483648 > <%28214%29%20748-3648> > >> <(214)%20748-3648> > >> > >> -XX:MaxNewSize=697933824 -XX:MaxTenuringThreshold=6 > >> > -XX:NewSize=697933824 > >> > >> -XX:OldPLABSize=16 -XX:OldSize=1395867648 -XX:+PrintGC > >> > >> -XX:+PrintGCTimeStamps -XX:+UseCompressedClassPointers > >> > >> -XX:+UseCompressedOops -XX:+UseConcMarkSweepGC -XX:+UseParNewGC > >> > >> 0,554: [GC (Allocation Failure) 545344K->8022K(2029056K), > 0,0055210 > >> > secs] > >> > >> 0,890: [GC (Allocation Failure) 553366K->93114K(2029056K), > 0,0177003 > >> > >> secs] > >> > >> 0,909: [GC (CMS Initial Mark) 97210K(2029056K), 0,0011810 secs] > >> > >> 1,204: [GC (Allocation Failure) 638458K->102278K(2029056K), > >> 0,0181779 > >> > >> secs] > >> > >> 1,536: [GC (Allocation Failure) 645878K->103955K(2029056K), > >> 0,0088187 > >> > >> secs] > >> > >> 1,679: [GC (CMS Final Remark) 353595K(2029056K), 0,0078553 secs] > >> > >> 1,858: [GC (Allocation Failure) 649299K->78654K(2029056K), > 0,0064803 > >> > >> secs] > >> > >> 2,187: [GC (Allocation Failure) 623998K->120130K(2029056K), > >> 0,0125856 > >> > >> secs] > >> > >> 2,200: [GC (CMS Initial Mark) 121879K(2029056K), 0,0003934 secs] > >> > >> 2,518: [GC (Allocation Failure) 665474K->193813K(2029056K), > >> 0,0202701 > >> > >> secs] > >> > >> > >> > >> > >> > >> > >> > >> > >> > >>> > >> > >>> > - using the v2 protocol brings huge improvements in the usage of > >> > memory > >> > >>> > - if you want good "latency" for each entry you have to set at > >> least > >> > >>> > throttle to 1000, but the throughput will be very lower (from 30 > >> MB/s > >> > >>> vs > >> > >>> > 300 MB/s on my disk) > >> > >>> > > >> > >>> > >> > >>> What type of disk do you have? What is your filesystem mount > >> options? > >> > >>> > >> > >> > >> > >> Disk type: SSD > >> > >> this is fstab line: > >> > >> /dev/mapper/Ssd-Home /home ext4 > >> > >> defaults,discard 1 2 > >> > >> > >> > >> uname -a > >> > >> Linux DNA101PC193.diennea.lan 4.11.5-100.fc24.x86_64 #1 SMP Wed Jun > >> 14 > >> > >> 17:21:39 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux > >> > >> > >> > >> > >> > >> > >> > >>> > >> > >>> When you did the test, how did GC activities look like when > setting > >> > >>> throttle to be more than 1000? > >> > >>> > >> > >> > >> > >> I have added new results, see above > >> > >> > >> > >> > >> > >>> > >> > >>> I think the latency looked "bad" when throttle was more than 1000, > >> is > >> > >>> because of GC. Did you have that test results? > >> > >>> > >> > >> > >> > >> > >> > >> I apologize, in the below results when I did not write 'throttle = > >> 0" I > >> > >> meant throttle = 1000 > >> > >> > >> > >> Did you take a look an the code of the bench ? Maybe a double check > >> > would > >> > >> be useful, many times badly written benchmarks are not useful at > all > >> > >> > >> > >> Thank you > >> > >> -- Enrico > >> > >> > >> > >> > >> > >> > >> > >>> > >> > >>> > >> > >>> > below the results of my simple bench > >> > >>> > > >> > >>> > Enrico > >> > >>> > > >> > >>> > Bench code: > >> > >>> > > >> > >>> https://github.com/eolivelli/bookkeepers-benchs/blob/ > >> > master/src/test/java/ > >> > >>> > BookKeeperWriteTest.java > >> > >>> > > >> > >>> > Results: > >> > >>> > with 4.5.0-SNAPSHOT and v2 protocol - G1 GC > >> > >>> > > >> > >>> > #0 Total wall clock time: 1000.0 ms, total callbacks time: 8460 > >> ms, > >> > >>> entry > >> > >>> > size 0,034 MB -> 8,46 ms per entry (latency),1,0 ms per entry > >> > >>> (throughput) > >> > >>> > 34,2 MB/s throughput > >> > >>> > #1 Total wall clock time: 1005.0 ms, total callbacks time: 4529 > >> ms, > >> > >>> entry > >> > >>> > size 0,034 MB -> 4,53 ms per entry (latency),1,0 ms per entry > >> > >>> (throughput) > >> > >>> > 34,0 MB/s throughput > >> > >>> > #2 Total wall clock time: 1002.0 ms, total callbacks time: 4293 > >> ms, > >> > >>> entry > >> > >>> > size 0,034 MB -> 4,29 ms per entry (latency),1,0 ms per entry > >> > >>> (throughput) > >> > >>> > 34,1 MB/s throughput > >> > >>> > #3 Total wall clock time: 1003.0 ms, total callbacks time: 4317 > >> ms, > >> > >>> entry > >> > >>> > size 0,034 MB -> 4,32 ms per entry (latency),1,0 ms per entry > >> > >>> (throughput) > >> > >>> > 34,1 MB/s throughput > >> > >>> > #4 Total wall clock time: 1002.0 ms, total callbacks time: 4553 > >> ms, > >> > >>> entry > >> > >>> > size 0,034 MB -> 4,55 ms per entry (latency),1,0 ms per entry > >> > >>> (throughput) > >> > >>> > 34,1 MB/s throughput > >> > >>> > #5 Total wall clock time: 1001.0 ms, total callbacks time: 4448 > >> ms, > >> > >>> entry > >> > >>> > size 0,034 MB -> 4,45 ms per entry (latency),1,0 ms per entry > >> > >>> (throughput) > >> > >>> > 34,1 MB/s throughput > >> > >>> > #6 Total wall clock time: 1004.0 ms, total callbacks time: 4449 > >> ms, > >> > >>> entry > >> > >>> > size 0,034 MB -> 4,45 ms per entry (latency),1,0 ms per entry > >> > >>> (throughput) > >> > >>> > 34,0 MB/s throughput > >> > >>> > #7 Total wall clock time: 1003.0 ms, total callbacks time: 4504 > >> ms, > >> > >>> entry > >> > >>> > size 0,034 MB -> 4,50 ms per entry (latency),1,0 ms per entry > >> > >>> (throughput) > >> > >>> > 34,1 MB/s throughput > >> > >>> > #8 Total wall clock time: 1002.0 ms, total callbacks time: 4290 > >> ms, > >> > >>> entry > >> > >>> > size 0,034 MB -> 4,29 ms per entry (latency),1,0 ms per entry > >> > >>> (throughput) > >> > >>> > 34,1 MB/s throughput > >> > >>> > #9 Total wall clock time: 1003.0 ms, total callbacks time: 4441 > >> ms, > >> > >>> entry > >> > >>> > size 0,034 MB -> 4,44 ms per entry (latency),1,0 ms per entry > >> > >>> (throughput) > >> > >>> > 34,1 MB/s throughput > >> > >>> > > >> > >>> > gc.log > >> > >>> > Java HotSpot(TM) 64-Bit Server VM (25.121-b13) for linux-amd64 > JRE > >> > >>> > (1.8.0_121-b13), built on Dec 12 2016 16:36:53 by "java_re" with > >> gcc > >> > >>> 4.3.0 > >> > >>> > 20080428 (Red Hat 4.3.0-8) > >> > >>> > Memory: 4k page, physical 32870636k(9153868k free), swap > >> > >>> > 15622140k(15464700k free) > >> > >>> > CommandLine flags: -XX:InitialHeapSize=2147483648 > >> <%28214%29%20748-3648> <(214)%20748-3648> > >> > >>> > -XX:MaxHeapSize=2147483648 <%28214%29%20748-3648> > >> <(214)%20748-3648> -XX:+PrintGC > >> > >>> -XX:+PrintGCTimeStamps > >> > >>> > -XX:+UseCompressedClassPointers -XX:+UseCompressedOops > >> -XX:+UseG1GC > >> > >>> > 1,988: [GC pause (G1 Evacuation Pause) (young) 146M->59M(2048M), > >> > >>> 0,0104300 > >> > >>> > secs] > >> > >>> > > >> > >>> > > >> > >>> > Results, with 4.5.0-SNAPSHOT and v3 protocol - G1 GC > >> > >>> > #0 Total wall clock time: 1001.0 ms, total callbacks time: 11800 > >> ms, > >> > >>> entry > >> > >>> > size 0,034 MB -> 11,80 ms per entry (latency),1,0 ms per entry > >> > >>> (throughput) > >> > >>> > 34,1 MB/s throughput > >> > >>> > #1 Total wall clock time: 1001.0 ms, total callbacks time: 4585 > >> ms, > >> > >>> entry > >> > >>> > size 0,034 MB -> 4,59 ms per entry (latency),1,0 ms per entry > >> > >>> (throughput) > >> > >>> > 34,1 MB/s throughput > >> > >>> > #2 Total wall clock time: 1002.0 ms, total callbacks time: 4458 > >> ms, > >> > >>> entry > >> > >>> > size 0,034 MB -> 4,46 ms per entry (latency),1,0 ms per entry > >> > >>> (throughput) > >> > >>> > 34,1 MB/s throughput > >> > >>> > #3 Total wall clock time: 1002.0 ms, total callbacks time: 4471 > >> ms, > >> > >>> entry > >> > >>> > size 0,034 MB -> 4,47 ms per entry (latency),1,0 ms per entry > >> > >>> (throughput) > >> > >>> > 34,1 MB/s throughput > >> > >>> > #4 Total wall clock time: 1004.0 ms, total callbacks time: 4473 > >> ms, > >> > >>> entry > >> > >>> > size 0,034 MB -> 4,47 ms per entry (latency),1,0 ms per entry > >> > >>> (throughput) > >> > >>> > 34,0 MB/s throughput > >> > >>> > #5 Total wall clock time: 1003.0 ms, total callbacks time: 4477 > >> ms, > >> > >>> entry > >> > >>> > size 0,034 MB -> 4,48 ms per entry (latency),1,0 ms per entry > >> > >>> (throughput) > >> > >>> > 34,1 MB/s throughput > >> > >>> > #6 Total wall clock time: 1007.0 ms, total callbacks time: 4424 > >> ms, > >> > >>> entry > >> > >>> > size 0,034 MB -> 4,42 ms per entry (latency),1,0 ms per entry > >> > >>> (throughput) > >> > >>> > 33,9 MB/s throughput > >> > >>> > #7 Total wall clock time: 1005.0 ms, total callbacks time: 4576 > >> ms, > >> > >>> entry > >> > >>> > size 0,034 MB -> 4,58 ms per entry (latency),1,0 ms per entry > >> > >>> (throughput) > >> > >>> > 34,0 MB/s throughput > >> > >>> > #8 Total wall clock time: 1002.0 ms, total callbacks time: 4447 > >> ms, > >> > >>> entry > >> > >>> > size 0,034 MB -> 4,45 ms per entry (latency),1,0 ms per entry > >> > >>> (throughput) > >> > >>> > 34,1 MB/s throughput > >> > >>> > #9 Total wall clock time: 1007.0 ms, total callbacks time: 4501 > >> ms, > >> > >>> entry > >> > >>> > size 0,034 MB -> 4,50 ms per entry (latency),1,0 ms per entry > >> > >>> (throughput) > >> > >>> > 33,9 MB/s throughput > >> > >>> > > >> > >>> > gc.log > >> > >>> > Java HotSpot(TM) 64-Bit Server VM (25.121-b13) for linux-amd64 > JRE > >> > >>> > (1.8.0_121-b13), built on Dec 12 2016 16:36:53 by "java_re" with > >> gcc > >> > >>> 4.3.0 > >> > >>> > 20080428 (Red Hat 4.3.0-8) > >> > >>> > Memory: 4k page, physical 32870636k(9042512k free), swap > >> > >>> > 15622140k(15464700k free) > >> > >>> > CommandLine flags: -XX:InitialHeapSize=2147483648 > >> <%28214%29%20748-3648> <(214)%20748-3648> > >> > >>> > -XX:MaxHeapSize=2147483648 <%28214%29%20748-3648> > >> <(214)%20748-3648> -XX:+PrintGC > >> > >>> -XX:+PrintGCTimeStamps > >> > >>> > -XX:+UseCompressedClassPointers -XX:+UseCompressedOops > >> -XX:+UseG1GC > >> > >>> > 0,830: [GC pause (G1 Evacuation Pause) (young) 118M->33M(2048M), > >> > >>> 0,0116376 > >> > >>> > secs] > >> > >>> > 1,372: [GC pause (G1 Evacuation Pause) (young) 162M->73M(2048M), > >> > >>> 0,0055103 > >> > >>> > secs] > >> > >>> > 2,172: [GC pause (G1 Evacuation Pause) (young) 227M->99M(2048M), > >> > >>> 0,0023962 > >> > >>> > secs] > >> > >>> > 10,358: [GC pause (G1 Evacuation Pause) (young) > >> 1620M->122M(2048M), > >> > >>> > 0,0026519 secs] > >> > >>> > 11,676: [GC pause (Metadata GC Threshold) (young) (initial-mark) > >> > >>> > 242M->114M(2048M), 0,0021797 secs] > >> > >>> > 11,678: [GC concurrent-root-region-scan-start] > >> > >>> > 11,679: [GC concurrent-root-region-scan-end, 0,0007421 secs] > >> > >>> > 11,679: [GC concurrent-mark-start] > >> > >>> > 11,684: [GC concurrent-mark-end, 0,0046645 secs] > >> > >>> > 11,684: [GC remark, 0,0027234 secs] > >> > >>> > 11,687: [GC cleanup 115M->111M(2048M), 0,0008957 secs] > >> > >>> > 11,688: [GC concurrent-cleanup-start] > >> > >>> > 11,688: [GC concurrent-cleanup-end, 0,0000070 secs] > >> > >>> > > >> > >>> > > >> > >>> > With 4.4.0 - v3 protocol - G1 GC > >> > >>> > #0 Total wall clock time: 1001.0 ms, total callbacks time: 6723 > >> ms, > >> > >>> entry > >> > >>> > size 0,034 MB -> 6,72 ms per entry (latency),1,0 ms per entry > >> > >>> (throughput) > >> > >>> > 34,1 MB/s throughput > >> > >>> > #1 Total wall clock time: 1003.0 ms, total callbacks time: 4495 > >> ms, > >> > >>> entry > >> > >>> > size 0,034 MB -> 4,50 ms per entry (latency),1,0 ms per entry > >> > >>> (throughput) > >> > >>> > 34,1 MB/s throughput > >> > >>> > #2 Total wall clock time: 1001.0 ms, total callbacks time: 4406 > >> ms, > >> > >>> entry > >> > >>> > size 0,034 MB -> 4,41 ms per entry (latency),1,0 ms per entry > >> > >>> (throughput) > >> > >>> > 34,1 MB/s throughput > >> > >>> > #3 Total wall clock time: 1005.0 ms, total callbacks time: 4457 > >> ms, > >> > >>> entry > >> > >>> > size 0,034 MB -> 4,46 ms per entry (latency),1,0 ms per entry > >> > >>> (throughput) > >> > >>> > 34,0 MB/s throughput > >> > >>> > #4 Total wall clock time: 1002.0 ms, total callbacks time: 4388 > >> ms, > >> > >>> entry > >> > >>> > size 0,034 MB -> 4,39 ms per entry (latency),1,0 ms per entry > >> > >>> (throughput) > >> > >>> > 34,1 MB/s throughput > >> > >>> > #5 Total wall clock time: 1004.0 ms, total callbacks time: 4451 > >> ms, > >> > >>> entry > >> > >>> > size 0,034 MB -> 4,45 ms per entry (latency),1,0 ms per entry > >> > >>> (throughput) > >> > >>> > 34,0 MB/s throughput > >> > >>> > #6 Total wall clock time: 1003.0 ms, total callbacks time: 4574 > >> ms, > >> > >>> entry > >> > >>> > size 0,034 MB -> 4,57 ms per entry (latency),1,0 ms per entry > >> > >>> (throughput) > >> > >>> > 34,1 MB/s throughput > >> > >>> > #7 Total wall clock time: 1003.0 ms, total callbacks time: 4523 > >> ms, > >> > >>> entry > >> > >>> > size 0,034 MB -> 4,52 ms per entry (latency),1,0 ms per entry > >> > >>> (throughput) > >> > >>> > 34,1 MB/s throughput > >> > >>> > #8 Total wall clock time: 1004.0 ms, total callbacks time: 4450 > >> ms, > >> > >>> entry > >> > >>> > size 0,034 MB -> 4,45 ms per entry (latency),1,0 ms per entry > >> > >>> (throughput) > >> > >>> > 34,0 MB/s throughput > >> > >>> > #9 Total wall clock time: 1002.0 ms, total callbacks time: 4542 > >> ms, > >> > >>> entry > >> > >>> > size 0,034 MB -> 4,54 ms per entry (latency),1,0 ms per entry > >> > >>> (throughput) > >> > >>> > 34,1 MB/s throughput > >> > >>> > > >> > >>> > gc.log > >> > >>> > Java HotSpot(TM) 64-Bit Server VM (25.121-b13) for linux-amd64 > JRE > >> > >>> > (1.8.0_121-b13), built on Dec 12 2016 16:36:53 by "java_re" with > >> gcc > >> > >>> 4.3.0 > >> > >>> > 20080428 (Red Hat 4.3.0-8) > >> > >>> > Memory: 4k page, physical 32870636k(8931792k free), swap > >> > >>> > 15622140k(15464700k free) > >> > >>> > CommandLine flags: -XX:InitialHeapSize=2147483648 > >> <%28214%29%20748-3648> <(214)%20748-3648> > >> > >>> > -XX:MaxHeapSize=2147483648 <%28214%29%20748-3648> > >> <(214)%20748-3648> -XX:+PrintGC > >> > >>> -XX:+PrintGCTimeStamps > >> > >>> > -XX:+UseCompressedClassPointers -XX:+UseCompressedOops > >> -XX:+UseG1GC > >> > >>> > 0,750: [GC pause (G1 Evacuation Pause) (young) 106M->14M(2048M), > >> > >>> 0,0070948 > >> > >>> > secs] > >> > >>> > 1,089: [GC pause (G1 Evacuation Pause) (young) 127M->21M(2048M), > >> > >>> 0,0038837 > >> > >>> > secs] > >> > >>> > 1,608: [GC pause (G1 Evacuation Pause) (young) 193M->41M(2048M), > >> > >>> 0,0039903 > >> > >>> > secs] > >> > >>> > 3,745: [GC pause (G1 Evacuation Pause) (young) 733M->49M(2048M), > >> > >>> 0,0046991 > >> > >>> > secs] > >> > >>> > 7,990: [GC pause (G1 Evacuation Pause) (young) > 1423M->65M(2048M), > >> > >>> 0,0052844 > >> > >>> > secs] > >> > >>> > > >> > >>> > > >> > >>> > > >> > >>> > with 4.4 - v3 protocol Parallel GC > >> > >>> > > >> > >>> > #0 Total wall clock time: 999.0 ms, total callbacks time: 6007 > ms, > >> > >>> entry > >> > >>> > size 0,034 MB -> 6,01 ms per entry (latency),1,0 ms per entry > >> > >>> (throughput) > >> > >>> > 34,2 MB/s throughput > >> > >>> > #1 Total wall clock time: 1001.0 ms, total callbacks time: 4813 > >> ms, > >> > >>> entry > >> > >>> > size 0,034 MB -> 4,81 ms per entry (latency),1,0 ms per entry > >> > >>> (throughput) > >> > >>> > 34,1 MB/s throughput > >> > >>> > #2 Total wall clock time: 1003.0 ms, total callbacks time: 4580 > >> ms, > >> > >>> entry > >> > >>> > size 0,034 MB -> 4,58 ms per entry (latency),1,0 ms per entry > >> > >>> (throughput) > >> > >>> > 34,1 MB/s throughput > >> > >>> > #3 Total wall clock time: 1002.0 ms, total callbacks time: 4349 > >> ms, > >> > >>> entry > >> > >>> > size 0,034 MB -> 4,35 ms per entry (latency),1,0 ms per entry > >> > >>> (throughput) > >> > >>> > 34,1 MB/s throughput > >> > >>> > #4 Total wall clock time: 1002.0 ms, total callbacks time: 4456 > >> ms, > >> > >>> entry > >> > >>> > size 0,034 MB -> 4,46 ms per entry (latency),1,0 ms per entry > >> > >>> (throughput) > >> > >>> > 34,1 MB/s throughput > >> > >>> > #5 Total wall clock time: 1006.0 ms, total callbacks time: 4504 > >> ms, > >> > >>> entry > >> > >>> > size 0,034 MB -> 4,50 ms per entry (latency),1,0 ms per entry > >> > >>> (throughput) > >> > >>> > 34,0 MB/s throughput > >> > >>> > #6 Total wall clock time: 1003.0 ms, total callbacks time: 4387 > >> ms, > >> > >>> entry > >> > >>> > size 0,034 MB -> 4,39 ms per entry (latency),1,0 ms per entry > >> > >>> (throughput) > >> > >>> > 34,1 MB/s throughput > >> > >>> > #7 Total wall clock time: 1002.0 ms, total callbacks time: 4643 > >> ms, > >> > >>> entry > >> > >>> > size 0,034 MB -> 4,64 ms per entry (latency),1,0 ms per entry > >> > >>> (throughput) > >> > >>> > 34,1 MB/s throughput > >> > >>> > #8 Total wall clock time: 1002.0 ms, total callbacks time: 4548 > >> ms, > >> > >>> entry > >> > >>> > size 0,034 MB -> 4,55 ms per entry (latency),1,0 ms per entry > >> > >>> (throughput) > >> > >>> > 34,1 MB/s throughput > >> > >>> > #9 Total wall clock time: 1001.0 ms, total callbacks time: 4633 > >> ms, > >> > >>> entry > >> > >>> > size 0,034 MB -> 4,63 ms per entry (latency),1,0 ms per entry > >> > >>> (throughput) > >> > >>> > 34,1 MB/s throughput > >> > >>> > > >> > >>> > gc.log > >> > >>> > Java HotSpot(TM) 64-Bit Server VM (25.121-b13) for linux-amd64 > JRE > >> > >>> > (1.8.0_121-b13), built on Dec 12 2016 16:36:53 by "java_re" with > >> gcc > >> > >>> 4.3.0 > >> > >>> > 20080428 (Red Hat 4.3.0-8) > >> > >>> > Memory: 4k page, physical 32870636k(8877776k free), swap > >> > >>> > 15622140k(15464700k free) > >> > >>> > CommandLine flags: -XX:InitialHeapSize=2147483648 > >> <%28214%29%20748-3648> <(214)%20748-3648> > >> > >>> > -XX:MaxHeapSize=2147483648 <%28214%29%20748-3648> > >> <(214)%20748-3648> -XX:+PrintGC > >> > >>> -XX:+PrintGCTimeStamps > >> > >>> > -XX:+UseCompressedClassPointers -XX:+UseCompressedOops > >> > >>> -XX:+UseParallelGC > >> > >>> > 0,603: [GC (Allocation Failure) 524800K->8488K(2010112K), > >> 0,0056746 > >> > >>> secs] > >> > >>> > 1,996: [GC (Allocation Failure) 533268K->181534K(2010112K), > >> > 0,0248460 > >> > >>> > secs] > >> > >>> > 3,475: [GC (Allocation Failure) 703337K->237612K(2010112K), > >> > 0,0192195 > >> > >>> > secs] > >> > >>> > 4,963: [GC (Allocation Failure) 762412K->212913K(2010112K), > >> > 0,0085357 > >> > >>> > secs] > >> > >>> > 6,473: [GC (Allocation Failure) 737684K->192597K(1951232K), > >> > 0,0065437 > >> > >>> > secs] > >> > >>> > 8,019: [GC (Allocation Failure) 704597K->319617K(1863680K), > >> > 0,0189275 > >> > >>> > secs] > >> > >>> > 9,008: [GC (Allocation Failure) 651905K->315463K(1853952K), > >> > 0,0121369 > >> > >>> > secs] > >> > >>> > 9,997: [GC (Allocation Failure) 647729K->418132K(1848832K), > >> > 0,0222260 > >> > >>> > secs] > >> > >>> > 10,742: [GC (Allocation Failure) 679764K->340097K(1878528K), > >> > 0,0111347 > >> > >>> > secs] > >> > >>> > > >> > >>> > > >> > >>> > with 4.5 - v2 protocol Parallel GC > >> > >>> > > >> > >>> > #0 Total wall clock time: 1004.0 ms, total callbacks time: 8030 > >> ms, > >> > >>> entry > >> > >>> > size 0,034 MB -> 8,03 ms per entry (latency),1,0 ms per entry > >> > >>> (throughput) > >> > >>> > 34,0 MB/s throughput > >> > >>> > #1 Total wall clock time: 1003.0 ms, total callbacks time: 4394 > >> ms, > >> > >>> entry > >> > >>> > size 0,034 MB -> 4,39 ms per entry (latency),1,0 ms per entry > >> > >>> (throughput) > >> > >>> > 34,1 MB/s throughput > >> > >>> > #2 Total wall clock time: 1002.0 ms, total callbacks time: 4417 > >> ms, > >> > >>> entry > >> > >>> > size 0,034 MB -> 4,42 ms per entry (latency),1,0 ms per entry > >> > >>> (throughput) > >> > >>> > 34,1 MB/s throughput > >> > >>> > #3 Total wall clock time: 1005.0 ms, total callbacks time: 4362 > >> ms, > >> > >>> entry > >> > >>> > size 0,034 MB -> 4,36 ms per entry (latency),1,0 ms per entry > >> > >>> (throughput) > >> > >>> > 34,0 MB/s throughput > >> > >>> > #4 Total wall clock time: 1003.0 ms, total callbacks time: 4435 > >> ms, > >> > >>> entry > >> > >>> > size 0,034 MB -> 4,44 ms per entry (latency),1,0 ms per entry > >> > >>> (throughput) > >> > >>> > 34,1 MB/s throughput > >> > >>> > #5 Total wall clock time: 1002.0 ms, total callbacks time: 4489 > >> ms, > >> > >>> entry > >> > >>> > size 0,034 MB -> 4,49 ms per entry (latency),1,0 ms per entry > >> > >>> (throughput) > >> > >>> > 34,1 MB/s throughput > >> > >>> > #6 Total wall clock time: 1002.0 ms, total callbacks time: 4346 > >> ms, > >> > >>> entry > >> > >>> > size 0,034 MB -> 4,35 ms per entry (latency),1,0 ms per entry > >> > >>> (throughput) > >> > >>> > 34,1 MB/s throughput > >> > >>> > #7 Total wall clock time: 1004.0 ms, total callbacks time: 4474 > >> ms, > >> > >>> entry > >> > >>> > size 0,034 MB -> 4,47 ms per entry (latency),1,0 ms per entry > >> > >>> (throughput) > >> > >>> > 34,0 MB/s throughput > >> > >>> > #8 Total wall clock time: 1005.0 ms, total callbacks time: 4436 > >> ms, > >> > >>> entry > >> > >>> > size 0,034 MB -> 4,44 ms per entry (latency),1,0 ms per entry > >> > >>> (throughput) > >> > >>> > 34,0 MB/s throughput > >> > >>> > #9 Total wall clock time: 1001.0 ms, total callbacks time: 4322 > >> ms, > >> > >>> entry > >> > >>> > size 0,034 MB -> 4,32 ms per entry (latency),1,0 ms per entry > >> > >>> (throughput) > >> > >>> > 34,1 MB/s throughput > >> > >>> > > >> > >>> > gc.log > >> > >>> > Java HotSpot(TM) 64-Bit Server VM (25.121-b13) for linux-amd64 > JRE > >> > >>> > (1.8.0_121-b13), built on Dec 12 2016 16:36:53 by "java_re" with > >> gcc > >> > >>> 4.3.0 > >> > >>> > 20080428 (Red Hat 4.3.0-8) > >> > >>> > Memory: 4k page, physical 32870636k(8951980k free), swap > >> > >>> > 15622140k(15464700k free) > >> > >>> > CommandLine flags: -XX:InitialHeapSize=2147483648 > >> <%28214%29%20748-3648> <(214)%20748-3648> > >> > >>> > -XX:MaxHeapSize=2147483648 <%28214%29%20748-3648> > >> <(214)%20748-3648> -XX:+PrintGC > >> > >>> -XX:+PrintGCTimeStamps > >> > >>> > -XX:+UseCompressedClassPointers -XX:+UseCompressedOops > >> > >>> -XX:+UseParallelGC > >> > >>> > 0,583: [GC (Allocation Failure) 524800K->8921K(2010112K), > >> 0,0054470 > >> > >>> secs] > >> > >>> > 6,397: [GC (Allocation Failure) 533721K->178956K(2010112K), > >> > 0,0306493 > >> > >>> secs > >> > >>> > > >> > >>> > > >> > >>> > with 4.5 - v3 protocol Parallel GC > >> > >>> > > >> > >>> > #0 Total wall clock time: 1002.0 ms, total callbacks time: 10679 > >> ms, > >> > >>> entry > >> > >>> > size 0,034 MB -> 10,68 ms per entry (latency),1,0 ms per entry > >> > >>> (throughput) > >> > >>> > 34,1 MB/s throughput > >> > >>> > #1 Total wall clock time: 1003.0 ms, total callbacks time: 4802 > >> ms, > >> > >>> entry > >> > >>> > size 0,034 MB -> 4,80 ms per entry (latency),1,0 ms per entry > >> > >>> (throughput) > >> > >>> > 34,1 MB/s throughput > >> > >>> > #2 Total wall clock time: 1003.0 ms, total callbacks time: 4479 > >> ms, > >> > >>> entry > >> > >>> > size 0,034 MB -> 4,48 ms per entry (latency),1,0 ms per entry > >> > >>> (throughput) > >> > >>> > 34,1 MB/s throughput > >> > >>> > #3 Total wall clock time: 1002.0 ms, total callbacks time: 4726 > >> ms, > >> > >>> entry > >> > >>> > size 0,034 MB -> 4,73 ms per entry (latency),1,0 ms per entry > >> > >>> (throughput) > >> > >>> > 34,1 MB/s throughput > >> > >>> > #4 Total wall clock time: 1007.0 ms, total callbacks time: 4393 > >> ms, > >> > >>> entry > >> > >>> > size 0,034 MB -> 4,39 ms per entry (latency),1,0 ms per entry > >> > >>> (throughput) > >> > >>> > 33,9 MB/s throughput > >> > >>> > #5 Total wall clock time: 1006.0 ms, total callbacks time: 4619 > >> ms, > >> > >>> entry > >> > >>> > size 0,034 MB -> 4,62 ms per entry (latency),1,0 ms per entry > >> > >>> (throughput) > >> > >>> > 34,0 MB/s throughput > >> > >>> > #6 Total wall clock time: 1002.0 ms, total callbacks time: 4346 > >> ms, > >> > >>> entry > >> > >>> > size 0,034 MB -> 4,35 ms per entry (latency),1,0 ms per entry > >> > >>> (throughput) > >> > >>> > 34,1 MB/s throughput > >> > >>> > #7 Total wall clock time: 1002.0 ms, total callbacks time: 4402 > >> ms, > >> > >>> entry > >> > >>> > size 0,034 MB -> 4,40 ms per entry (latency),1,0 ms per entry > >> > >>> (throughput) > >> > >>> > 34,1 MB/s throughput > >> > >>> > #8 Total wall clock time: 1004.0 ms, total callbacks time: 4487 > >> ms, > >> > >>> entry > >> > >>> > size 0,034 MB -> 4,49 ms per entry (latency),1,0 ms per entry > >> > >>> (throughput) > >> > >>> > 34,0 MB/s throughput > >> > >>> > #9 Total wall clock time: 1007.0 ms, total callbacks time: 4315 > >> ms, > >> > >>> entry > >> > >>> > size 0,034 MB -> 4,32 ms per entry (latency),1,0 ms per entry > >> > >>> (throughput) > >> > >>> > 33,9 MB/s throughput > >> > >>> > > >> > >>> > gc.log > >> > >>> > > >> > >>> > Java HotSpot(TM) 64-Bit Server VM (25.121-b13) for linux-amd64 > JRE > >> > >>> > (1.8.0_121-b13), built on Dec 12 2016 16:36:53 by "java_re" with > >> gcc > >> > >>> 4.3.0 > >> > >>> > 20080428 (Red Hat 4.3.0-8) > >> > >>> > Memory: 4k page, physical 32870636k(8921388k free), swap > >> > >>> > 15622140k(15464700k free) > >> > >>> > CommandLine flags: -XX:InitialHeapSize=2147483648 > >> <%28214%29%20748-3648> <(214)%20748-3648> > >> > >>> > -XX:MaxHeapSize=2147483648 <%28214%29%20748-3648> > >> <(214)%20748-3648> -XX:+PrintGC > >> > >>> -XX:+PrintGCTimeStamps > >> > >>> > -XX:+UseCompressedClassPointers -XX:+UseCompressedOops > >> > >>> -XX:+UseParallelGC > >> > >>> > 0,563: [GC (Allocation Failure) 524800K->8935K(2010112K), > >> 0,0055378 > >> > >>> secs] > >> > >>> > 1,936: [GC (Allocation Failure) 533735K->187494K(2010112K), > >> > 0,0281195 > >> > >>> > secs] > >> > >>> > 3,917: [GC (Allocation Failure) 712294K->275491K(2010112K), > >> > 0,0298168 > >> > >>> > secs] > >> > >>> > 6,385: [GC (Allocation Failure) 798141K->394430K(2010112K), > >> > 0,0279730 > >> > >>> > secs] > >> > >>> > 8,934: [GC (Allocation Failure) 919230K->394530K(2008576K), > >> > 0,0127794 > >> > >>> > secs] > >> > >>> > 11,634: [GC (Metadata GC Threshold) 811154K->348997K(1868288K), > >> > >>> 0,0080161 > >> > >>> > secs] > >> > >>> > 11,642: [Full GC (Metadata GC Threshold) > >> 348997K->39874K(1868288K), > >> > >>> > 0,0247462 secs] > >> > >>> > > >> > >>> > > >> > >>> > 4.5 v3 protocol parallel GC throttle = 0 > >> > >>> > > >> > >>> > #0 Total wall clock time: 334.0 ms, total callbacks time: 267444 > >> ms, > >> > >>> entry > >> > >>> > size 0,034 MB -> 267,44 ms per entry (latency),0,3 ms per entry > >> > >>> > (throughput) 102,3 MB/s throughput > >> > >>> > #1 Total wall clock time: 189.0 ms, total callbacks time: 154428 > >> ms, > >> > >>> entry > >> > >>> > size 0,034 MB -> 154,43 ms per entry (latency),0,2 ms per entry > >> > >>> > (throughput) 180,8 MB/s throughput > >> > >>> > #2 Total wall clock time: 136.0 ms, total callbacks time: 103012 > >> ms, > >> > >>> entry > >> > >>> > size 0,034 MB -> 103,01 ms per entry (latency),0,1 ms per entry > >> > >>> > (throughput) 251,3 MB/s throughput > >> > >>> > #3 Total wall clock time: 156.0 ms, total callbacks time: 113315 > >> ms, > >> > >>> entry > >> > >>> > size 0,034 MB -> 113,32 ms per entry (latency),0,2 ms per entry > >> > >>> > (throughput) 219,1 MB/s throughput > >> > >>> > #4 Total wall clock time: 110.0 ms, total callbacks time: 79296 > >> ms, > >> > >>> entry > >> > >>> > size 0,034 MB -> 79,30 ms per entry (latency),0,1 ms per entry > >> > >>> (throughput) > >> > >>> > 310,7 MB/s throughput > >> > >>> > #5 Total wall clock time: 104.0 ms, total callbacks time: 71213 > >> ms, > >> > >>> entry > >> > >>> > size 0,034 MB -> 71,21 ms per entry (latency),0,1 ms per entry > >> > >>> (throughput) > >> > >>> > 328,7 MB/s throughput > >> > >>> > #6 Total wall clock time: 161.0 ms, total callbacks time: 72447 > >> ms, > >> > >>> entry > >> > >>> > size 0,034 MB -> 72,45 ms per entry (latency),0,2 ms per entry > >> > >>> (throughput) > >> > >>> > 212,3 MB/s throughput > >> > >>> > #7 Total wall clock time: 103.0 ms, total callbacks time: 82783 > >> ms, > >> > >>> entry > >> > >>> > size 0,034 MB -> 82,78 ms per entry (latency),0,1 ms per entry > >> > >>> (throughput) > >> > >>> > 331,8 MB/s throughput > >> > >>> > #8 Total wall clock time: 138.0 ms, total callbacks time: 106086 > >> ms, > >> > >>> entry > >> > >>> > size 0,034 MB -> 106,09 ms per entry (latency),0,1 ms per entry > >> > >>> > (throughput) 247,7 MB/s throughput > >> > >>> > #9 Total wall clock time: 118.0 ms, total callbacks time: 77750 > >> ms, > >> > >>> entry > >> > >>> > size 0,034 MB -> 77,75 ms per entry (latency),0,1 ms per entry > >> > >>> (throughput) > >> > >>> > 289,7 MB/s throughput > >> > >>> > > >> > >>> > Java HotSpot(TM) 64-Bit Server VM (25.121-b13) for linux-amd64 > JRE > >> > >>> > (1.8.0_121-b13), built on Dec 12 2016 16:36:53 by "java_re" with > >> gcc > >> > >>> 4.3.0 > >> > >>> > 20080428 (Red Hat 4.3.0-8) > >> > >>> > Memory: 4k page, physical 32870636k(8858832k free), swap > >> > >>> > 15622140k(15464700k free) > >> > >>> > CommandLine flags: -XX:InitialHeapSize=2147483648 > >> <%28214%29%20748-3648> <(214)%20748-3648> > >> > >>> > -XX:MaxHeapSize=2147483648 <%28214%29%20748-3648> > >> <(214)%20748-3648> -XX:+PrintGC > >> > >>> -XX:+PrintGCTimeStamps > >> > >>> > -XX:+UseCompressedClassPointers -XX:+UseCompressedOops > >> > >>> -XX:+UseParallelGC > >> > >>> > 0,660: [GC (Allocation Failure) 524800K->8955K(2010112K), > >> 0,0061442 > >> > >>> secs] > >> > >>> > 1,288: [GC (Allocation Failure) 533726K->191538K(2010112K), > >> > 0,0329663 > >> > >>> > secs] > >> > >>> > 1,634: [GC (Allocation Failure) 716338K->349468K(2010112K), > >> > 0,0508922 > >> > >>> > secs] > >> > >>> > 1,942: [GC (Allocation Failure) 874268K->671717K(2010112K), > >> > 0,0609749 > >> > >>> > secs] > >> > >>> > 2,266: [GC (Allocation Failure) 1196517K->781335K(2010112K), > >> > 0,0320305 > >> > >>> > secs] > >> > >>> > > >> > >>> > 4.5 v2protocol throttle = 0 G1 GC > >> > >>> > > >> > >>> > #0 Total wall clock time: 200.0 ms, total callbacks time: 173682 > >> ms, > >> > >>> entry > >> > >>> > size 0,034 MB -> 173,68 ms per entry (latency),0,2 ms per entry > >> > >>> > (throughput) 170,9 MB/s throughput > >> > >>> > #1 Total wall clock time: 128.0 ms, total callbacks time: 89987 > >> ms, > >> > >>> entry > >> > >>> > size 0,034 MB -> 89,99 ms per entry (latency),0,1 ms per entry > >> > >>> (throughput) > >> > >>> > 267,0 MB/s throughput > >> > >>> > #2 Total wall clock time: 117.0 ms, total callbacks time: 86411 > >> ms, > >> > >>> entry > >> > >>> > size 0,034 MB -> 86,41 ms per entry (latency),0,1 ms per entry > >> > >>> (throughput) > >> > >>> > 292,1 MB/s throughput > >> > >>> > #3 Total wall clock time: 95.0 ms, total callbacks time: 64054 > ms, > >> > >>> entry > >> > >>> > size 0,034 MB -> 64,05 ms per entry (latency),0,1 ms per entry > >> > >>> (throughput) > >> > >>> > 359,8 MB/s throughput > >> > >>> > #4 Total wall clock time: 92.0 ms, total callbacks time: 60863 > ms, > >> > >>> entry > >> > >>> > size 0,034 MB -> 60,86 ms per entry (latency),0,1 ms per entry > >> > >>> (throughput) > >> > >>> > 371,5 MB/s throughput > >> > >>> > #5 Total wall clock time: 92.0 ms, total callbacks time: 62026 > ms, > >> > >>> entry > >> > >>> > size 0,034 MB -> 62,03 ms per entry (latency),0,1 ms per entry > >> > >>> (throughput) > >> > >>> > 371,5 MB/s throughput > >> > >>> > #6 Total wall clock time: 93.0 ms, total callbacks time: 65033 > ms, > >> > >>> entry > >> > >>> > size 0,034 MB -> 65,03 ms per entry (latency),0,1 ms per entry > >> > >>> (throughput) > >> > >>> > 367,5 MB/s throughput > >> > >>> > #7 Total wall clock time: 88.0 ms, total callbacks time: 59851 > ms, > >> > >>> entry > >> > >>> > size 0,034 MB -> 59,85 ms per entry (latency),0,1 ms per entry > >> > >>> (throughput) > >> > >>> > 388,4 MB/s throughput > >> > >>> > #8 Total wall clock time: 93.0 ms, total callbacks time: 61816 > ms, > >> > >>> entry > >> > >>> > size 0,034 MB -> 61,82 ms per entry (latency),0,1 ms per entry > >> > >>> (throughput) > >> > >>> > 367,5 MB/s throughput > >> > >>> > #9 Total wall clock time: 93.0 ms, total callbacks time: 70397 > ms, > >> > >>> entry > >> > >>> > size 0,034 MB -> 70,40 ms per entry (latency),0,1 ms per entry > >> > >>> (throughput) > >> > >>> > 367,5 MB/s throughput > >> > >>> > > >> > >>> > > >> > >>> > Java HotSpot(TM) 64-Bit Server VM (25.121-b13) for linux-amd64 > JRE > >> > >>> > (1.8.0_121-b13), built on Dec 12 2016 16:36:53 by "java_re" with > >> gcc > >> > >>> 4.3.0 > >> > >>> > 20080428 (Red Hat 4.3.0-8) > >> > >>> > Memory: 4k page, physical 32870636k(8894828k free), swap > >> > >>> > 15622140k(15464700k free) > >> > >>> > CommandLine flags: -XX:InitialHeapSize=2147483648 > >> <%28214%29%20748-3648> <(214)%20748-3648> > >> > >>> > -XX:MaxHeapSize=2147483648 <%28214%29%20748-3648> > >> <(214)%20748-3648> -XX:+PrintGC > >> > >>> -XX:+PrintGCTimeStamps > >> > >>> > -XX:+UseCompressedClassPointers -XX:+UseCompressedOops > >> -XX:+UseG1GC > >> > >>> > 1,078: [GC pause (G1 Evacuation Pause) (young) 150M->65M(2048M), > >> > >>> 0,0138100 > >> > >>> > secs] > >> > >>> > > >> > >>> > bk 4.5 v3 protocol G1 GC throttle = 0 > >> > >>> > > >> > >>> > #0 Total wall clock time: 275.0 ms, total callbacks time: 238043 > >> ms, > >> > >>> entry > >> > >>> > size 0,034 MB -> 238,04 ms per entry (latency),0,3 ms per entry > >> > >>> > (throughput) 124,3 MB/s throughput > >> > >>> > #1 Total wall clock time: 148.0 ms, total callbacks time: 106877 > >> ms, > >> > >>> entry > >> > >>> > size 0,034 MB -> 106,88 ms per entry (latency),0,1 ms per entry > >> > >>> > (throughput) 230,9 MB/s throughput > >> > >>> > #2 Total wall clock time: 130.0 ms, total callbacks time: 96448 > >> ms, > >> > >>> entry > >> > >>> > size 0,034 MB -> 96,45 ms per entry (latency),0,1 ms per entry > >> > >>> (throughput) > >> > >>> > 262,9 MB/s throughput > >> > >>> > #3 Total wall clock time: 118.0 ms, total callbacks time: 78246 > >> ms, > >> > >>> entry > >> > >>> > size 0,034 MB -> 78,25 ms per entry (latency),0,1 ms per entry > >> > >>> (throughput) > >> > >>> > 289,7 MB/s throughput > >> > >>> > #4 Total wall clock time: 109.0 ms, total callbacks time: 79110 > >> ms, > >> > >>> entry > >> > >>> > size 0,034 MB -> 79,11 ms per entry (latency),0,1 ms per entry > >> > >>> (throughput) > >> > >>> > 313,6 MB/s throughput > >> > >>> > #5 Total wall clock time: 124.0 ms, total callbacks time: 86652 > >> ms, > >> > >>> entry > >> > >>> > size 0,034 MB -> 86,65 ms per entry (latency),0,1 ms per entry > >> > >>> (throughput) > >> > >>> > 275,6 MB/s throughput > >> > >>> > #6 Total wall clock time: 110.0 ms, total callbacks time: 76729 > >> ms, > >> > >>> entry > >> > >>> > size 0,034 MB -> 76,73 ms per entry (latency),0,1 ms per entry > >> > >>> (throughput) > >> > >>> > 310,7 MB/s throughput > >> > >>> > #7 Total wall clock time: 110.0 ms, total callbacks time: 80104 > >> ms, > >> > >>> entry > >> > >>> > size 0,034 MB -> 80,10 ms per entry (latency),0,1 ms per entry > >> > >>> (throughput) > >> > >>> > 310,7 MB/s throughput > >> > >>> > #8 Total wall clock time: 104.0 ms, total callbacks time: 73514 > >> ms, > >> > >>> entry > >> > >>> > size 0,034 MB -> 73,51 ms per entry (latency),0,1 ms per entry > >> > >>> (throughput) > >> > >>> > 328,7 MB/s throughput > >> > >>> > #9 Total wall clock time: 102.0 ms, total callbacks time: 74487 > >> ms, > >> > >>> entry > >> > >>> > size 0,034 MB -> 74,49 ms per entry (latency),0,1 ms per entry > >> > >>> (throughput) > >> > >>> > 335,1 MB/s throughput > >> > >>> > > >> > >>> > Java HotSpot(TM) 64-Bit Server VM (25.121-b13) for linux-amd64 > JRE > >> > >>> > (1.8.0_121-b13), built on Dec 12 2016 16:36:53 by "java_re" with > >> gcc > >> > >>> 4.3.0 > >> > >>> > 20080428 (Red Hat 4.3.0-8) > >> > >>> > Memory: 4k page, physical 32870636k(8958604k free), swap > >> > >>> > 15622140k(15464700k free) > >> > >>> > CommandLine flags: -XX:InitialHeapSize=2147483648 > >> <%28214%29%20748-3648> <(214)%20748-3648> > >> > >>> > -XX:MaxHeapSize=2147483648 <%28214%29%20748-3648> > >> <(214)%20748-3648> -XX:+PrintGC > >> > >>> -XX:+PrintGCTimeStamps > >> > >>> > -XX:+UseCompressedClassPointers -XX:+UseCompressedOops > >> -XX:+UseG1GC > >> > >>> > 0,876: [GC pause (G1 Evacuation Pause) (young) 102M->19M(2048M), > >> > >>> 0,0107143 > >> > >>> > secs] > >> > >>> > 0,989: [GC pause (G1 Evacuation Pause) (young) 152M->81M(2048M), > >> > >>> 0,0074561 > >> > >>> > secs] > >> > >>> > 1,125: [GC pause (G1 Evacuation Pause) (young) > 198M->108M(2048M), > >> > >>> 0,0038296 > >> > >>> > secs] > >> > >>> > 1,286: [GC pause (G1 Evacuation Pause) (young) > 306M->146M(2048M), > >> > >>> 0,0034297 > >> > >>> > secs] > >> > >>> > > >> > >>> > > >> > >>> > > >> > >>> > 2017-07-12 3:07 GMT+02:00 Sijie Guo <[email protected]>: > >> > >>> > > >> > >>> > > On Wed, Jul 12, 2017 at 2:04 AM, Venkateswara Rao Jujjuri < > >> > >>> > > [email protected] > >> > >>> > > > wrote: > >> > >>> > > > >> > >>> > > > Enrico, let me try to paraphrase the issue. > >> > >>> > > > > >> > >>> > > > - With G1GC + Netty 4.1 is giving you roughly same perf as > >> prev > >> > >>> > release. > >> > >>> > > Is > >> > >>> > > > that accurate statement? > >> > >>> > > > But you are still seeing latency spikes with Netty 4.1?? > >> > >>> > > > >> > >>> > > > >> > >>> > > > >> > >>> > > > >> > >>> > > > >> > >>> > > > I did not fully > >> > >>> > > > understand your sleep usecase. > >> > >>> > > > How sleep is yielding better latency? > >> > >>> > > > > >> > >>> > > > >> > >>> > > I think the "sleep(1)" in Enrico's benchmark is for > throttling. > >> It > >> > is > >> > >>> > same > >> > >>> > > as setting 'throttle' value to 1000 in his test. > >> > >>> > > > >> > >>> > > - Sijie > >> > >>> > > > >> > >>> > > > >> > >>> > > > >> > >>> > > > > >> > >>> > > > Thanks, > >> > >>> > > > JV > >> > >>> > > > > >> > >>> > > > On Tue, Jul 11, 2017 at 8:27 AM, Enrico Olivelli < > >> > >>> [email protected]> > >> > >>> > > > wrote: > >> > >>> > > > > >> > >>> > > > > Another interesting thing...during my profiling activity I > >> > gave a > >> > >>> > > chance > >> > >>> > > > to > >> > >>> > > > > the old v2 protocol and activated the gc logs, as expected > >> the > >> > >>> result > >> > >>> > > is > >> > >>> > > > > that with v2 protocol there is almost no GC activity > during > >> the > >> > >>> > > benchmark > >> > >>> > > > > -- Enrico > >> > >>> > > > > > >> > >>> > > > > 2017-07-11 12:07 GMT+02:00 Enrico Olivelli < > >> > [email protected] > >> > >>> >: > >> > >>> > > > > > >> > >>> > > > > > > >> > >>> > > > > > > >> > >>> > > > > > 2017-07-11 11:04 GMT+02:00 Sijie Guo < > [email protected] > >> >: > >> > >>> > > > > > > >> > >>> > > > > >> I think Netty4 requires more offheap memory. you might > >> need > >> > to > >> > >>> > tune > >> > >>> > > > the > >> > >>> > > > > >> JVM > >> > >>> > > > > >> settings. I doubt that latency diff coming from the JVM > >> gc. > >> > >>> > > > > >> > >> > >>> > > > > >> A simple thing to verify that is to dump the gc log by > >> > adding > >> > >>> " > >> > >>> > > > > -Xloggc:" > >> > >>> > > > > >> setting and compare the gc logs between versions. > >> > >>> > > > > >> > >> > >>> > > > > > > >> > >>> > > > > > > >> > >>> > > > > > Enabling G1 GC (-XX:+UseG1GC) apparently solves the > >> > difference > >> > >>> > > between > >> > >>> > > > > 4.4 > >> > >>> > > > > > and 4.5. > >> > >>> > > > > > > >> > >>> > > > > > And I a loop the bench (in order to have a warm JVM) > >> numbers > >> > >>> become > >> > >>> > > > more > >> > >>> > > > > > stable and similar to each other > >> > >>> > > > > > > >> > >>> > > > > > I have the "other issue" pending, the fact that latency > >> (time > >> > >>> for > >> > >>> > > > > > asynchAddEntry to complete) is really bad and > >> unpredictable > >> > >>> (from > >> > >>> > 100 > >> > >>> > > > ms > >> > >>> > > > > to > >> > >>> > > > > > 2000ms) > >> > >>> > > > > > > >> > >>> > > > > > and if I introduce a Thread.sleep(1) all the callbacks > >> > complete > >> > >>> > > always > >> > >>> > > > > > with success in < 10ms. This happens even with G1 GC. > >> > >>> > > > > > > >> > >>> > > > > > Without the "sleep(1)" my machine uses a lot of CPU (I > >> have 8 > >> > >>> > "CPUs") > >> > >>> > > > and > >> > >>> > > > > > with the "sleep(1)" the load is slightly lower > >> > >>> > > > > > > >> > >>> > > > > > Honestly for me this is still a problem and I hope that > >> with > >> > >>> your > >> > >>> > > help > >> > >>> > > > I > >> > >>> > > > > > will be able to find the problem, wherever it is (in BK > >> code > >> > >>> or in > >> > >>> > > the > >> > >>> > > > > way > >> > >>> > > > > > I am doing the bench) > >> > >>> > > > > > > >> > >>> > > > > > I will try to create a new more complete bench > >> > >>> > > > > > > >> > >>> > > > > > -- Enrico > >> > >>> > > > > > > >> > >>> > > > > > > >> > >>> > > > > > > >> > >>> > > > > > > >> > >>> > > > > >> > >> > >>> > > > > >> - Sijie > >> > >>> > > > > >> > >> > >>> > > > > >> On Tue, Jul 11, 2017 at 12:16 AM, Enrico Olivelli < > >> > >>> > > > [email protected]> > >> > >>> > > > > >> wrote: > >> > >>> > > > > >> > >> > >>> > > > > >> > a did a bisect and the culprit (in my opinion) is the > >> > >>> switch to > >> > >>> > > > netty > >> > >>> > > > > 4 > >> > >>> > > > > >> for > >> > >>> > > > > >> > the performance regression from 4.5 and 4.4 > >> > >>> > > > > >> > > >> > >>> > > > > >> > at commit: > >> > >>> > > > > >> > commit 811ece53a1c975c4e768422f3d622ac9de6b3e41 > >> > >>> > > BOOKKEEPER-1058: > >> > >>> > > > > >> Ignore > >> > >>> > > > > >> > already deleted ledger on replication audit > >> > >>> > > > > >> > > >> > >>> > > > > >> > Total time: 204 ms > >> > >>> > > > > >> > Total real time: 79 ms per entry > >> > >>> > > > > >> > > >> > >>> > > > > >> > at commit: > >> > >>> > > > > >> > commit 74f795136c1fff3badb29fc982d0cc2d43096b45 > >> > >>> > BOOKKEEPER-1008: > >> > >>> > > > > Netty > >> > >>> > > > > >> 4.1 > >> > >>> > > > > >> > > >> > >>> > > > > >> > Total time: 308 ms > >> > >>> > > > > >> > Total real time: 189 ms per entry > >> > >>> > > > > >> > > >> > >>> > > > > >> > I have tried with epoll and with local transport, > >> results > >> > >>> does > >> > >>> > not > >> > >>> > > > > >> change. > >> > >>> > > > > >> > I tried to upgrade to netty 4.1.13 too, but no change > >> > >>> > > > > >> > > >> > >>> > > > > >> > Could it be the memory allocator of netty which is > >> > >>> overwhelmed > >> > >>> > > with > >> > >>> > > > > >> sudden > >> > >>> > > > > >> > bursts of allocation ? > >> > >>> > > > > >> > I did some trial with UnpooledByteBufAllocator. > DEFAULT > >> > and > >> > >>> it > >> > >>> > > > helps a > >> > >>> > > > > >> > little, we get to "110 ms per entry" vs "189 ms per > >> entry" > >> > >>> > > > > >> > > >> > >>> > > > > >> > the bench is here: > >> > >>> > > > > >> > > >> > >>> https://github.com/eolivelli/bookkeepers-benchs/blob/master/ > >> > >>> > > > > >> src/test/java/ > >> > >>> > > > > >> > BookKeeperWriteTest.java > >> > >>> > > > > >> > > >> > >>> > > > > >> > > >> > >>> > > > > >> > -- Enrico > >> > >>> > > > > >> > > >> > >>> > > > > >> > > >> > >>> > > > > >> > > >> > >>> > > > > >> > 2017-07-10 19:46 GMT+02:00 Enrico Olivelli < > >> > >>> [email protected] > >> > >>> > >: > >> > >>> > > > > >> > > >> > >>> > > > > >> > > > >> > >>> > > > > >> > > > >> > >>> > > > > >> > > Il lun 10 lug 2017, 18:21 Venkateswara Rao Jujjuri > < > >> > >>> > > > > [email protected] > >> > >>> > > > > >> > > >> > >>> > > > > >> > ha > >> > >>> > > > > >> > > scritto: > >> > >>> > > > > >> > > > >> > >>> > > > > >> > >> With Netty changes, lack of native epoll() has > huge > >> > perf > >> > >>> > impact > >> > >>> > > > as > >> > >>> > > > > >> per > >> > >>> > > > > >> > >> Kishore. > >> > >>> > > > > >> > >> Are you sure you are using epoll()? > >> > >>> > > > > >> > >> > >> > >>> > > > > >> > > > >> > >>> > > > > >> > > Yes. I tried with netty local transport too. It > seems > >> > not > >> > >>> > > related > >> > >>> > > > to > >> > >>> > > > > >> > netty > >> > >>> > > > > >> > > to me. > >> > >>> > > > > >> > > I will double check, tomorrow > >> > >>> > > > > >> > > Enrico > >> > >>> > > > > >> > > > >> > >>> > > > > >> > > > >> > >>> > > > > >> > >> On Mon, Jul 10, 2017 at 1:49 AM, Enrico Olivelli < > >> > >>> > > > > >> [email protected]> > >> > >>> > > > > >> > >> wrote: > >> > >>> > > > > >> > >> > >> > >>> > > > > >> > >> > 2017-07-10 10:40 GMT+02:00 Sijie Guo < > >> > >>> [email protected]>: > >> > >>> > > > > >> > >> > > >> > >>> > > > > >> > >> > > Also one other thing to check is the JVM > >> settings. > >> > >>> Do you > >> > >>> > > > mind > >> > >>> > > > > >> > sharing > >> > >>> > > > > >> > >> > that > >> > >>> > > > > >> > >> > > as well? > >> > >>> > > > > >> > >> > > > >> > >>> > > > > >> > >> > > > >> > >>> > > > > >> > >> > this is the surefire config, I am using oracle > >> jdk 8 > >> > >>> > > > > >> > >> > > >> > >>> > > > > >> > >> > <plugin> > >> > >>> > > > > >> > >> > <artifactId>maven-surefire- > >> > >>> > > plugin</artifactId> > >> > >>> > > > > >> > >> > <version>2.20</version> > >> > >>> > > > > >> > >> > <configuration> > >> > >>> > > > > >> > >> > <forkCount>1</forkCount> > >> > >>> > > > > >> > >> > <reuseForks>false</ > >> reuseForks> > >> > >>> > > > > >> > >> > > >> > >>> > > > > >> > >> > <forkedProcessTimeoutInSeconds>300</ > >> > >>> > > > > forkedProcessTimeoutInSeconds> > >> > >>> > > > > >> > >> > <argLine>-Xmx2G > >> > >>> > > > > >> > >> > -Djava.io.tmpdir=${basedir}/target</argLine> > >> > >>> > > > > >> > >> > </configuration> > >> > >>> > > > > >> > >> > </plugin> > >> > >>> > > > > >> > >> > > >> > >>> > > > > >> > >> > -- Enrico > >> > >>> > > > > >> > >> > > >> > >>> > > > > >> > >> > > >> > >>> > > > > >> > >> > > >> > >>> > > > > >> > >> > > Sijie > >> > >>> > > > > >> > >> > > > >> > >>> > > > > >> > >> > > On Jul 10, 2017 1:17 AM, "Sijie Guo" < > >> > >>> [email protected] > >> > >>> > > > >> > >>> > > > > wrote: > >> > >>> > > > > >> > >> > > > >> > >>> > > > > >> > >> > > > I am not sure if there is any default values > >> > >>> changed > >> > >>> > for > >> > >>> > > > > >> journal > >> > >>> > > > > >> > >> > > settings. > >> > >>> > > > > >> > >> > > > I would suggest you testing by setting > >> > >>> specifically the > >> > >>> > > > > journal > >> > >>> > > > > >> > >> > settings. > >> > >>> > > > > >> > >> > > > > >> > >>> > > > > >> > >> > > > Also if you can share your benchmark, that > >> would > >> > be > >> > >>> > good > >> > >>> > > > for > >> > >>> > > > > >> other > >> > >>> > > > > >> > >> > people > >> > >>> > > > > >> > >> > > > to verify. > >> > >>> > > > > >> > >> > > > > >> > >>> > > > > >> > >> > > > Sijie > >> > >>> > > > > >> > >> > > > > >> > >>> > > > > >> > >> > > > On Jul 10, 2017 12:32 AM, "Enrico Olivelli" > < > >> > >>> > > > > >> [email protected]> > >> > >>> > > > > >> > >> > wrote: > >> > >>> > > > > >> > >> > > > > >> > >>> > > > > >> > >> > > >> Hi, > >> > >>> > > > > >> > >> > > >> I am doing some benchmarks on BK, I see > that > >> > from > >> > >>> > 4.4.0 > >> > >>> > > to > >> > >>> > > > > >> 4.5.0 > >> > >>> > > > > >> > >> there > >> > >>> > > > > >> > >> > > is > >> > >>> > > > > >> > >> > > >> something "slow" but I cannot understand > >> what. I > >> > >>> > really > >> > >>> > > > hope > >> > >>> > > > > >> that > >> > >>> > > > > >> > >> I am > >> > >>> > > > > >> > >> > > >> wrong. > >> > >>> > > > > >> > >> > > >> > >> > >>> > > > > >> > >> > > >> I am working with writes, I will pass to > >> reads > >> > >>> once > >> > >>> > > writes > >> > >>> > > > > >> will > >> > >>> > > > > >> > be > >> > >>> > > > > >> > >> ok. > >> > >>> > > > > >> > >> > > >> My problem is both on latency (time for > >> > >>> AddComplete > >> > >>> > > > callback > >> > >>> > > > > >> to > >> > >>> > > > > >> > >> > > complete) > >> > >>> > > > > >> > >> > > >> and on overall throuput. > >> > >>> > > > > >> > >> > > >> > >> > >>> > > > > >> > >> > > >> Actually I have two distinct problems, but > >> > >>> working on > >> > >>> > > the > >> > >>> > > > > >> first > >> > >>> > > > > >> > >> > problem > >> > >>> > > > > >> > >> > > I > >> > >>> > > > > >> > >> > > >> found a performance regression. > >> > >>> > > > > >> > >> > > >> I know that talking about "slow" things it > >> is an > >> > >>> hard > >> > >>> > > > > matter, > >> > >>> > > > > >> so > >> > >>> > > > > >> > I > >> > >>> > > > > >> > >> > will > >> > >>> > > > > >> > >> > > >> try > >> > >>> > > > > >> > >> > > >> do describe as much as possible all the > >> aspects > >> > >>> that I > >> > >>> > > > think > >> > >>> > > > > >> are > >> > >>> > > > > >> > >> > > relevant. > >> > >>> > > > > >> > >> > > >> > >> > >>> > > > > >> > >> > > >> First problem: under certain load > performance > >> > >>> > > > > >> > (latency+throughput) > >> > >>> > > > > >> > >> > > degrade > >> > >>> > > > > >> > >> > > >> too much > >> > >>> > > > > >> > >> > > >> Second problem: the first problem is more > >> > evident > >> > >>> in > >> > >>> > > 4.5.0 > >> > >>> > > > > >> > >> > > >> > >> > >>> > > > > >> > >> > > >> Let's describe my testcase and why I am > >> worried. > >> > >>> > > > > >> > >> > > >> The bench issues a batch of asyncAddEntry > and > >> > >>> prints > >> > >>> > the > >> > >>> > > > > >> average > >> > >>> > > > > >> > >> time > >> > >>> > > > > >> > >> > > for > >> > >>> > > > > >> > >> > > >> AddComplete to complete and the overall > clock > >> > >>> time. > >> > >>> > > > > >> > >> > > >> > >> > >>> > > > > >> > >> > > >> This is the code > >> > >>> > > > > >> > >> > > >> > >> > >>> > > > > >> > >> > > >> private static final byte[] TEST_DATA = new > >> > >>> byte[35 * > >> > >>> > > > 1024]; > >> > >>> > > > > >> > >> > > >> private static final int testsize = 1000; > >> > >>> > > > > >> > >> > > >> > >> > >>> > > > > >> > >> > > >> ...... (start 1 bookie, see below) > >> > >>> > > > > >> > >> > > >> ClientConfiguration > >> > >>> clientConfiguration = > >> > >>> > > new > >> > >>> > > > > >> > >> > > >> ClientConfiguration(); > >> > >>> > > > > >> > >> > > >> clientConfiguration.setZkServ > >> > >>> > > > > >> ers(env.getAddress()); > >> > >>> > > > > >> > >> > > >> try (BookKeeper bk = new > >> BookKeeper( > >> > >>> > > > > >> > >> clientConfiguration); > >> > >>> > > > > >> > >> > > >> LedgerHandle lh = > >> > >>> bk.createLedger(1, > >> > >>> > 1, > >> > >>> > > 1, > >> > >>> > > > > >> > >> > > >> BookKeeper.DigestType.CRC32, new byte[0])) > { > >> > >>> > > > > >> > >> > > >> LongAdder totalTime = new > >> > >>> LongAdder(); > >> > >>> > > > > >> > >> > > >> long _start = > >> > >>> > > System.currentTimeMillis(); > >> > >>> > > > > >> > >> > > >> > Collection<CompletableFuture> > >> > >>> batch = > >> > >>> > > new > >> > >>> > > > > >> > >> > > >> ConcurrentLinkedQueue<>(); > >> > >>> > > > > >> > >> > > >> for (int i = 0; i < > testsize; > >> > >>> i++) { > >> > >>> > > > > >> > >> > > >> CompletableFuture cf = > >> new > >> > >>> > > > > >> > CompletableFuture(); > >> > >>> > > > > >> > >> > > >> batch.add(cf); > >> > >>> > > > > >> > >> > > >> > >> lh.asyncAddEntry(TEST_DATA, > >> > >>> new > >> > >>> > > > > >> > >> > > >> AsyncCallback.AddCallback() { > >> > >>> > > > > >> > >> > > >> > >> > >>> > > > > >> > >> > > >> long start = > >> > >>> > > > > >> System.currentTimeMillis(); > >> > >>> > > > > >> > >> > > >> > >> > >>> > > > > >> > >> > > >> @Override > >> > >>> > > > > >> > >> > > >> public void > >> > >>> addComplete(int > >> > >>> > rc, > >> > >>> > > > > >> > >> LedgerHandle > >> > >>> > > > > >> > >> > lh, > >> > >>> > > > > >> > >> > > >> long entryId, Object ctx) { > >> > >>> > > > > >> > >> > > >> long now = > >> > >>> > > > > >> > >> > > >> System.currentTimeMillis(); > >> > >>> > > > > >> > >> > > >> > CompletableFuture > >> > _cf > >> > >>> = > >> > >>> > > > > >> > >> > (CompletableFuture) > >> > >>> > > > > >> > >> > > >> ctx; > >> > >>> > > > > >> > >> > > >> if (rc == > >> > >>> > > > BKException.Code.OK) { > >> > >>> > > > > >> > >> > > >> > >> > _cf.complete(""); > >> > >>> > > > > >> > >> > > >> } else { > >> > >>> > > > > >> > >> > > >> > >> > >>> > > > > >> > >> > > >> _cf.completeExceptionally( > >> > BKException.create(rc)); > >> > >>> > > > > >> > >> > > >> } > >> > >>> > > > > >> > >> > > >> > >> totalTime.add(now - > >> > >>> > start); > >> > >>> > > > > >> > >> > > >> > > > > -- > > > > > > -- Enrico Olivelli > > > -- Jvrao --- First they ignore you, then they laugh at you, then they fight you, then you win. - Mahatma Gandhi
