Sijie, JV, just a recap of my point of view:
- considering latency = "time for asyncAddEntry to complete"
- there is some difference between 4.4 and 4.5 in memory usage, but it is
not so clear
- the type of GC (Parallel vs G1) does not have much impact, but with G1
you achieve the best latency
- using the v2 protocol brings huge improvements in memory usage
- if you want good "latency" for each entry you have to set throttle to at
least 1000, but the throughput will be much lower (about 30 MB/s vs
300 MB/s on my disk); see the sketch below
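
To be clear about what "throttle" means in my bench: it bounds the number of
outstanding asyncAddEntry calls, conceptually like a semaphore. A minimal
sketch (the Semaphore here is mine, not necessarily how the bench is coded;
lh, TEST_DATA and testsize are as in the bench code):

    import java.util.concurrent.Semaphore;

    final Semaphore permits = new Semaphore(1000);    // throttle = 1000
    for (int i = 0; i < testsize; i++) {
        permits.acquireUninterruptibly(); // blocks while 1000 adds are in flight
        lh.asyncAddEntry(TEST_DATA, (rc, handle, entryId, ctx) -> {
            permits.release();            // free a slot when the add completes
            // ... record latency, complete the future, etc.
        }, null);
    }

throttle = 0 means "no limit", which is the high-throughput / bad-latency
case in the results below.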

Below are the results of my simple bench.

Enrico

Bench code:
https://github.com/eolivelli/bookkeepers-benchs/blob/master/src/test/java/BookKeeperWriteTest.java
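
To read the numbers: "ms per entry (latency)" is the sum of the individual
callback times divided by the number of entries, "ms per entry (throughput)"
is the wall clock time divided by the number of entries, and the MB/s figure
is (entries x entry size) / wall clock time. A simplified sketch of the
computation (not the exact bench code):

    long wallClockMs = stopMs - startMs;          // "Total wall clock time"
    long callbacksMs = totalTime.sum();           // "total callbacks time"
    double entrySizeMB = (35 * 1024) / (1024.0 * 1024.0);  // ~0.034 MB payload

    double latencyMsPerEntry    = callbacksMs / (double) testsize;
    double throughputMsPerEntry = wallClockMs / (double) testsize;
    double mbPerSec = (testsize * entrySizeMB) / (wallClockMs / 1000.0);

For example: 1000 entries x 0.034 MB / ~1.0 s of wall clock = ~34 MB/s.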

Results:
with 4.5.0-SNAPSHOT and v2 protocol - G1 GC

#0 Total wall clock time: 1000.0 ms, total callbacks time: 8460 ms, entry
size 0,034 MB -> 8,46 ms per entry (latency),1,0 ms per entry (throughput)
34,2 MB/s throughput
#1 Total wall clock time: 1005.0 ms, total callbacks time: 4529 ms, entry
size 0,034 MB -> 4,53 ms per entry (latency),1,0 ms per entry (throughput)
34,0 MB/s throughput
#2 Total wall clock time: 1002.0 ms, total callbacks time: 4293 ms, entry
size 0,034 MB -> 4,29 ms per entry (latency),1,0 ms per entry (throughput)
34,1 MB/s throughput
#3 Total wall clock time: 1003.0 ms, total callbacks time: 4317 ms, entry
size 0,034 MB -> 4,32 ms per entry (latency),1,0 ms per entry (throughput)
34,1 MB/s throughput
#4 Total wall clock time: 1002.0 ms, total callbacks time: 4553 ms, entry
size 0,034 MB -> 4,55 ms per entry (latency),1,0 ms per entry (throughput)
34,1 MB/s throughput
#5 Total wall clock time: 1001.0 ms, total callbacks time: 4448 ms, entry
size 0,034 MB -> 4,45 ms per entry (latency),1,0 ms per entry (throughput)
34,1 MB/s throughput
#6 Total wall clock time: 1004.0 ms, total callbacks time: 4449 ms, entry
size 0,034 MB -> 4,45 ms per entry (latency),1,0 ms per entry (throughput)
34,0 MB/s throughput
#7 Total wall clock time: 1003.0 ms, total callbacks time: 4504 ms, entry
size 0,034 MB -> 4,50 ms per entry (latency),1,0 ms per entry (throughput)
34,1 MB/s throughput
#8 Total wall clock time: 1002.0 ms, total callbacks time: 4290 ms, entry
size 0,034 MB -> 4,29 ms per entry (latency),1,0 ms per entry (throughput)
34,1 MB/s throughput
#9 Total wall clock time: 1003.0 ms, total callbacks time: 4441 ms, entry
size 0,034 MB -> 4,44 ms per entry (latency),1,0 ms per entry (throughput)
34,1 MB/s throughput
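
(Note: the gc.log snippets below were captured with flags equivalent to the
following; the full flag dump appears in each log header, only the -Xloggc
file name is my addition.)

    -Xms2g -Xmx2g -XX:+PrintGC -XX:+PrintGCTimeStamps -Xloggc:gc.log
    -XX:+UseG1GC   (or -XX:+UseParallelGC for the Parallel runs)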

gc.log
Java HotSpot(TM) 64-Bit Server VM (25.121-b13) for linux-amd64 JRE
(1.8.0_121-b13), built on Dec 12 2016 16:36:53 by "java_re" with gcc 4.3.0
20080428 (Red Hat 4.3.0-8)
Memory: 4k page, physical 32870636k(9153868k free), swap
15622140k(15464700k free)
CommandLine flags: -XX:InitialHeapSize=2147483648
-XX:MaxHeapSize=2147483648 -XX:+PrintGC -XX:+PrintGCTimeStamps
-XX:+UseCompressedClassPointers -XX:+UseCompressedOops -XX:+UseG1GC
1,988: [GC pause (G1 Evacuation Pause) (young) 146M->59M(2048M), 0,0104300
secs]


Results with 4.5.0-SNAPSHOT and v3 protocol - G1 GC
#0 Total wall clock time: 1001.0 ms, total callbacks time: 11800 ms, entry
size 0,034 MB -> 11,80 ms per entry (latency),1,0 ms per entry (throughput)
34,1 MB/s throughput
#1 Total wall clock time: 1001.0 ms, total callbacks time: 4585 ms, entry
size 0,034 MB -> 4,59 ms per entry (latency),1,0 ms per entry (throughput)
34,1 MB/s throughput
#2 Total wall clock time: 1002.0 ms, total callbacks time: 4458 ms, entry
size 0,034 MB -> 4,46 ms per entry (latency),1,0 ms per entry (throughput)
34,1 MB/s throughput
#3 Total wall clock time: 1002.0 ms, total callbacks time: 4471 ms, entry
size 0,034 MB -> 4,47 ms per entry (latency),1,0 ms per entry (throughput)
34,1 MB/s throughput
#4 Total wall clock time: 1004.0 ms, total callbacks time: 4473 ms, entry
size 0,034 MB -> 4,47 ms per entry (latency),1,0 ms per entry (throughput)
34,0 MB/s throughput
#5 Total wall clock time: 1003.0 ms, total callbacks time: 4477 ms, entry
size 0,034 MB -> 4,48 ms per entry (latency),1,0 ms per entry (throughput)
34,1 MB/s throughput
#6 Total wall clock time: 1007.0 ms, total callbacks time: 4424 ms, entry
size 0,034 MB -> 4,42 ms per entry (latency),1,0 ms per entry (throughput)
33,9 MB/s throughput
#7 Total wall clock time: 1005.0 ms, total callbacks time: 4576 ms, entry
size 0,034 MB -> 4,58 ms per entry (latency),1,0 ms per entry (throughput)
34,0 MB/s throughput
#8 Total wall clock time: 1002.0 ms, total callbacks time: 4447 ms, entry
size 0,034 MB -> 4,45 ms per entry (latency),1,0 ms per entry (throughput)
34,1 MB/s throughput
#9 Total wall clock time: 1007.0 ms, total callbacks time: 4501 ms, entry
size 0,034 MB -> 4,50 ms per entry (latency),1,0 ms per entry (throughput)
33,9 MB/s throughput

gc.log
Java HotSpot(TM) 64-Bit Server VM (25.121-b13) for linux-amd64 JRE
(1.8.0_121-b13), built on Dec 12 2016 16:36:53 by "java_re" with gcc 4.3.0
20080428 (Red Hat 4.3.0-8)
Memory: 4k page, physical 32870636k(9042512k free), swap
15622140k(15464700k free)
CommandLine flags: -XX:InitialHeapSize=2147483648
-XX:MaxHeapSize=2147483648 -XX:+PrintGC -XX:+PrintGCTimeStamps
-XX:+UseCompressedClassPointers -XX:+UseCompressedOops -XX:+UseG1GC
0,830: [GC pause (G1 Evacuation Pause) (young) 118M->33M(2048M), 0,0116376
secs]
1,372: [GC pause (G1 Evacuation Pause) (young) 162M->73M(2048M), 0,0055103
secs]
2,172: [GC pause (G1 Evacuation Pause) (young) 227M->99M(2048M), 0,0023962
secs]
10,358: [GC pause (G1 Evacuation Pause) (young) 1620M->122M(2048M),
0,0026519 secs]
11,676: [GC pause (Metadata GC Threshold) (young) (initial-mark)
242M->114M(2048M), 0,0021797 secs]
11,678: [GC concurrent-root-region-scan-start]
11,679: [GC concurrent-root-region-scan-end, 0,0007421 secs]
11,679: [GC concurrent-mark-start]
11,684: [GC concurrent-mark-end, 0,0046645 secs]
11,684: [GC remark, 0,0027234 secs]
11,687: [GC cleanup 115M->111M(2048M), 0,0008957 secs]
11,688: [GC concurrent-cleanup-start]
11,688: [GC concurrent-cleanup-end, 0,0000070 secs]


With 4.4.0 - v3 protocol - G1 GC
#0 Total wall clock time: 1001.0 ms, total callbacks time: 6723 ms, entry
size 0,034 MB -> 6,72 ms per entry (latency),1,0 ms per entry (throughput)
34,1 MB/s throughput
#1 Total wall clock time: 1003.0 ms, total callbacks time: 4495 ms, entry
size 0,034 MB -> 4,50 ms per entry (latency),1,0 ms per entry (throughput)
34,1 MB/s throughput
#2 Total wall clock time: 1001.0 ms, total callbacks time: 4406 ms, entry
size 0,034 MB -> 4,41 ms per entry (latency),1,0 ms per entry (throughput)
34,1 MB/s throughput
#3 Total wall clock time: 1005.0 ms, total callbacks time: 4457 ms, entry
size 0,034 MB -> 4,46 ms per entry (latency),1,0 ms per entry (throughput)
34,0 MB/s throughput
#4 Total wall clock time: 1002.0 ms, total callbacks time: 4388 ms, entry
size 0,034 MB -> 4,39 ms per entry (latency),1,0 ms per entry (throughput)
34,1 MB/s throughput
#5 Total wall clock time: 1004.0 ms, total callbacks time: 4451 ms, entry
size 0,034 MB -> 4,45 ms per entry (latency),1,0 ms per entry (throughput)
34,0 MB/s throughput
#6 Total wall clock time: 1003.0 ms, total callbacks time: 4574 ms, entry
size 0,034 MB -> 4,57 ms per entry (latency),1,0 ms per entry (throughput)
34,1 MB/s throughput
#7 Total wall clock time: 1003.0 ms, total callbacks time: 4523 ms, entry
size 0,034 MB -> 4,52 ms per entry (latency),1,0 ms per entry (throughput)
34,1 MB/s throughput
#8 Total wall clock time: 1004.0 ms, total callbacks time: 4450 ms, entry
size 0,034 MB -> 4,45 ms per entry (latency),1,0 ms per entry (throughput)
34,0 MB/s throughput
#9 Total wall clock time: 1002.0 ms, total callbacks time: 4542 ms, entry
size 0,034 MB -> 4,54 ms per entry (latency),1,0 ms per entry (throughput)
34,1 MB/s throughput

gc.log
Java HotSpot(TM) 64-Bit Server VM (25.121-b13) for linux-amd64 JRE
(1.8.0_121-b13), built on Dec 12 2016 16:36:53 by "java_re" with gcc 4.3.0
20080428 (Red Hat 4.3.0-8)
Memory: 4k page, physical 32870636k(8931792k free), swap
15622140k(15464700k free)
CommandLine flags: -XX:InitialHeapSize=2147483648
-XX:MaxHeapSize=2147483648 -XX:+PrintGC -XX:+PrintGCTimeStamps
-XX:+UseCompressedClassPointers -XX:+UseCompressedOops -XX:+UseG1GC
0,750: [GC pause (G1 Evacuation Pause) (young) 106M->14M(2048M), 0,0070948
secs]
1,089: [GC pause (G1 Evacuation Pause) (young) 127M->21M(2048M), 0,0038837
secs]
1,608: [GC pause (G1 Evacuation Pause) (young) 193M->41M(2048M), 0,0039903
secs]
3,745: [GC pause (G1 Evacuation Pause) (young) 733M->49M(2048M), 0,0046991
secs]
7,990: [GC pause (G1 Evacuation Pause) (young) 1423M->65M(2048M), 0,0052844
secs]



With 4.4 - v3 protocol - Parallel GC

#0 Total wall clock time: 999.0 ms, total callbacks time: 6007 ms, entry
size 0,034 MB -> 6,01 ms per entry (latency),1,0 ms per entry (throughput)
34,2 MB/s throughput
#1 Total wall clock time: 1001.0 ms, total callbacks time: 4813 ms, entry
size 0,034 MB -> 4,81 ms per entry (latency),1,0 ms per entry (throughput)
34,1 MB/s throughput
#2 Total wall clock time: 1003.0 ms, total callbacks time: 4580 ms, entry
size 0,034 MB -> 4,58 ms per entry (latency),1,0 ms per entry (throughput)
34,1 MB/s throughput
#3 Total wall clock time: 1002.0 ms, total callbacks time: 4349 ms, entry
size 0,034 MB -> 4,35 ms per entry (latency),1,0 ms per entry (throughput)
34,1 MB/s throughput
#4 Total wall clock time: 1002.0 ms, total callbacks time: 4456 ms, entry
size 0,034 MB -> 4,46 ms per entry (latency),1,0 ms per entry (throughput)
34,1 MB/s throughput
#5 Total wall clock time: 1006.0 ms, total callbacks time: 4504 ms, entry
size 0,034 MB -> 4,50 ms per entry (latency),1,0 ms per entry (throughput)
34,0 MB/s throughput
#6 Total wall clock time: 1003.0 ms, total callbacks time: 4387 ms, entry
size 0,034 MB -> 4,39 ms per entry (latency),1,0 ms per entry (throughput)
34,1 MB/s throughput
#7 Total wall clock time: 1002.0 ms, total callbacks time: 4643 ms, entry
size 0,034 MB -> 4,64 ms per entry (latency),1,0 ms per entry (throughput)
34,1 MB/s throughput
#8 Total wall clock time: 1002.0 ms, total callbacks time: 4548 ms, entry
size 0,034 MB -> 4,55 ms per entry (latency),1,0 ms per entry (throughput)
34,1 MB/s throughput
#9 Total wall clock time: 1001.0 ms, total callbacks time: 4633 ms, entry
size 0,034 MB -> 4,63 ms per entry (latency),1,0 ms per entry (throughput)
34,1 MB/s throughput

gc.log
Java HotSpot(TM) 64-Bit Server VM (25.121-b13) for linux-amd64 JRE
(1.8.0_121-b13), built on Dec 12 2016 16:36:53 by "java_re" with gcc 4.3.0
20080428 (Red Hat 4.3.0-8)
Memory: 4k page, physical 32870636k(8877776k free), swap
15622140k(15464700k free)
CommandLine flags: -XX:InitialHeapSize=2147483648
-XX:MaxHeapSize=2147483648 -XX:+PrintGC -XX:+PrintGCTimeStamps
-XX:+UseCompressedClassPointers -XX:+UseCompressedOops -XX:+UseParallelGC
0,603: [GC (Allocation Failure)  524800K->8488K(2010112K), 0,0056746 secs]
1,996: [GC (Allocation Failure)  533268K->181534K(2010112K), 0,0248460 secs]
3,475: [GC (Allocation Failure)  703337K->237612K(2010112K), 0,0192195 secs]
4,963: [GC (Allocation Failure)  762412K->212913K(2010112K), 0,0085357 secs]
6,473: [GC (Allocation Failure)  737684K->192597K(1951232K), 0,0065437 secs]
8,019: [GC (Allocation Failure)  704597K->319617K(1863680K), 0,0189275 secs]
9,008: [GC (Allocation Failure)  651905K->315463K(1853952K), 0,0121369 secs]
9,997: [GC (Allocation Failure)  647729K->418132K(1848832K), 0,0222260 secs]
10,742: [GC (Allocation Failure)  679764K->340097K(1878528K), 0,0111347
secs]


With 4.5 - v2 protocol - Parallel GC

#0 Total wall clock time: 1004.0 ms, total callbacks time: 8030 ms, entry
size 0,034 MB -> 8,03 ms per entry (latency),1,0 ms per entry (throughput)
34,0 MB/s throughput
#1 Total wall clock time: 1003.0 ms, total callbacks time: 4394 ms, entry
size 0,034 MB -> 4,39 ms per entry (latency),1,0 ms per entry (throughput)
34,1 MB/s throughput
#2 Total wall clock time: 1002.0 ms, total callbacks time: 4417 ms, entry
size 0,034 MB -> 4,42 ms per entry (latency),1,0 ms per entry (throughput)
34,1 MB/s throughput
#3 Total wall clock time: 1005.0 ms, total callbacks time: 4362 ms, entry
size 0,034 MB -> 4,36 ms per entry (latency),1,0 ms per entry (throughput)
34,0 MB/s throughput
#4 Total wall clock time: 1003.0 ms, total callbacks time: 4435 ms, entry
size 0,034 MB -> 4,44 ms per entry (latency),1,0 ms per entry (throughput)
34,1 MB/s throughput
#5 Total wall clock time: 1002.0 ms, total callbacks time: 4489 ms, entry
size 0,034 MB -> 4,49 ms per entry (latency),1,0 ms per entry (throughput)
34,1 MB/s throughput
#6 Total wall clock time: 1002.0 ms, total callbacks time: 4346 ms, entry
size 0,034 MB -> 4,35 ms per entry (latency),1,0 ms per entry (throughput)
34,1 MB/s throughput
#7 Total wall clock time: 1004.0 ms, total callbacks time: 4474 ms, entry
size 0,034 MB -> 4,47 ms per entry (latency),1,0 ms per entry (throughput)
34,0 MB/s throughput
#8 Total wall clock time: 1005.0 ms, total callbacks time: 4436 ms, entry
size 0,034 MB -> 4,44 ms per entry (latency),1,0 ms per entry (throughput)
34,0 MB/s throughput
#9 Total wall clock time: 1001.0 ms, total callbacks time: 4322 ms, entry
size 0,034 MB -> 4,32 ms per entry (latency),1,0 ms per entry (throughput)
34,1 MB/s throughput

gc.log
Java HotSpot(TM) 64-Bit Server VM (25.121-b13) for linux-amd64 JRE
(1.8.0_121-b13), built on Dec 12 2016 16:36:53 by "java_re" with gcc 4.3.0
20080428 (Red Hat 4.3.0-8)
Memory: 4k page, physical 32870636k(8951980k free), swap
15622140k(15464700k free)
CommandLine flags: -XX:InitialHeapSize=2147483648
-XX:MaxHeapSize=2147483648 -XX:+PrintGC -XX:+PrintGCTimeStamps
-XX:+UseCompressedClassPointers -XX:+UseCompressedOops -XX:+UseParallelGC
0,583: [GC (Allocation Failure)  524800K->8921K(2010112K), 0,0054470 secs]
6,397: [GC (Allocation Failure)  533721K->178956K(2010112K), 0,0306493 secs]


With 4.5 - v3 protocol - Parallel GC

#0 Total wall clock time: 1002.0 ms, total callbacks time: 10679 ms, entry
size 0,034 MB -> 10,68 ms per entry (latency),1,0 ms per entry (throughput)
34,1 MB/s throughput
#1 Total wall clock time: 1003.0 ms, total callbacks time: 4802 ms, entry
size 0,034 MB -> 4,80 ms per entry (latency),1,0 ms per entry (throughput)
34,1 MB/s throughput
#2 Total wall clock time: 1003.0 ms, total callbacks time: 4479 ms, entry
size 0,034 MB -> 4,48 ms per entry (latency),1,0 ms per entry (throughput)
34,1 MB/s throughput
#3 Total wall clock time: 1002.0 ms, total callbacks time: 4726 ms, entry
size 0,034 MB -> 4,73 ms per entry (latency),1,0 ms per entry (throughput)
34,1 MB/s throughput
#4 Total wall clock time: 1007.0 ms, total callbacks time: 4393 ms, entry
size 0,034 MB -> 4,39 ms per entry (latency),1,0 ms per entry (throughput)
33,9 MB/s throughput
#5 Total wall clock time: 1006.0 ms, total callbacks time: 4619 ms, entry
size 0,034 MB -> 4,62 ms per entry (latency),1,0 ms per entry (throughput)
34,0 MB/s throughput
#6 Total wall clock time: 1002.0 ms, total callbacks time: 4346 ms, entry
size 0,034 MB -> 4,35 ms per entry (latency),1,0 ms per entry (throughput)
34,1 MB/s throughput
#7 Total wall clock time: 1002.0 ms, total callbacks time: 4402 ms, entry
size 0,034 MB -> 4,40 ms per entry (latency),1,0 ms per entry (throughput)
34,1 MB/s throughput
#8 Total wall clock time: 1004.0 ms, total callbacks time: 4487 ms, entry
size 0,034 MB -> 4,49 ms per entry (latency),1,0 ms per entry (throughput)
34,0 MB/s throughput
#9 Total wall clock time: 1007.0 ms, total callbacks time: 4315 ms, entry
size 0,034 MB -> 4,32 ms per entry (latency),1,0 ms per entry (throughput)
33,9 MB/s throughput

gc.log

Java HotSpot(TM) 64-Bit Server VM (25.121-b13) for linux-amd64 JRE
(1.8.0_121-b13), built on Dec 12 2016 16:36:53 by "java_re" with gcc 4.3.0
20080428 (Red Hat 4.3.0-8)
Memory: 4k page, physical 32870636k(8921388k free), swap
15622140k(15464700k free)
CommandLine flags: -XX:InitialHeapSize=2147483648
-XX:MaxHeapSize=2147483648 -XX:+PrintGC -XX:+PrintGCTimeStamps
-XX:+UseCompressedClassPointers -XX:+UseCompressedOops -XX:+UseParallelGC
0,563: [GC (Allocation Failure)  524800K->8935K(2010112K), 0,0055378 secs]
1,936: [GC (Allocation Failure)  533735K->187494K(2010112K), 0,0281195 secs]
3,917: [GC (Allocation Failure)  712294K->275491K(2010112K), 0,0298168 secs]
6,385: [GC (Allocation Failure)  798141K->394430K(2010112K), 0,0279730 secs]
8,934: [GC (Allocation Failure)  919230K->394530K(2008576K), 0,0127794 secs]
11,634: [GC (Metadata GC Threshold)  811154K->348997K(1868288K), 0,0080161
secs]
11,642: [Full GC (Metadata GC Threshold)  348997K->39874K(1868288K),
0,0247462 secs]


With 4.5 - v3 protocol - Parallel GC - throttle = 0

#0 Total wall clock time: 334.0 ms, total callbacks time: 267444 ms, entry
size 0,034 MB -> 267,44 ms per entry (latency),0,3 ms per entry
(throughput) 102,3 MB/s throughput
#1 Total wall clock time: 189.0 ms, total callbacks time: 154428 ms, entry
size 0,034 MB -> 154,43 ms per entry (latency),0,2 ms per entry
(throughput) 180,8 MB/s throughput
#2 Total wall clock time: 136.0 ms, total callbacks time: 103012 ms, entry
size 0,034 MB -> 103,01 ms per entry (latency),0,1 ms per entry
(throughput) 251,3 MB/s throughput
#3 Total wall clock time: 156.0 ms, total callbacks time: 113315 ms, entry
size 0,034 MB -> 113,32 ms per entry (latency),0,2 ms per entry
(throughput) 219,1 MB/s throughput
#4 Total wall clock time: 110.0 ms, total callbacks time: 79296 ms, entry
size 0,034 MB -> 79,30 ms per entry (latency),0,1 ms per entry (throughput)
310,7 MB/s throughput
#5 Total wall clock time: 104.0 ms, total callbacks time: 71213 ms, entry
size 0,034 MB -> 71,21 ms per entry (latency),0,1 ms per entry (throughput)
328,7 MB/s throughput
#6 Total wall clock time: 161.0 ms, total callbacks time: 72447 ms, entry
size 0,034 MB -> 72,45 ms per entry (latency),0,2 ms per entry (throughput)
212,3 MB/s throughput
#7 Total wall clock time: 103.0 ms, total callbacks time: 82783 ms, entry
size 0,034 MB -> 82,78 ms per entry (latency),0,1 ms per entry (throughput)
331,8 MB/s throughput
#8 Total wall clock time: 138.0 ms, total callbacks time: 106086 ms, entry
size 0,034 MB -> 106,09 ms per entry (latency),0,1 ms per entry
(throughput) 247,7 MB/s throughput
#9 Total wall clock time: 118.0 ms, total callbacks time: 77750 ms, entry
size 0,034 MB -> 77,75 ms per entry (latency),0,1 ms per entry (throughput)
289,7 MB/s throughput

Java HotSpot(TM) 64-Bit Server VM (25.121-b13) for linux-amd64 JRE
(1.8.0_121-b13), built on Dec 12 2016 16:36:53 by "java_re" with gcc 4.3.0
20080428 (Red Hat 4.3.0-8)
Memory: 4k page, physical 32870636k(8858832k free), swap
15622140k(15464700k free)
CommandLine flags: -XX:InitialHeapSize=2147483648
-XX:MaxHeapSize=2147483648 -XX:+PrintGC -XX:+PrintGCTimeStamps
-XX:+UseCompressedClassPointers -XX:+UseCompressedOops -XX:+UseParallelGC
0,660: [GC (Allocation Failure)  524800K->8955K(2010112K), 0,0061442 secs]
1,288: [GC (Allocation Failure)  533726K->191538K(2010112K), 0,0329663 secs]
1,634: [GC (Allocation Failure)  716338K->349468K(2010112K), 0,0508922 secs]
1,942: [GC (Allocation Failure)  874268K->671717K(2010112K), 0,0609749 secs]
2,266: [GC (Allocation Failure)  1196517K->781335K(2010112K), 0,0320305
secs]

With 4.5 - v2 protocol - G1 GC - throttle = 0

#0 Total wall clock time: 200.0 ms, total callbacks time: 173682 ms, entry
size 0,034 MB -> 173,68 ms per entry (latency),0,2 ms per entry
(throughput) 170,9 MB/s throughput
#1 Total wall clock time: 128.0 ms, total callbacks time: 89987 ms, entry
size 0,034 MB -> 89,99 ms per entry (latency),0,1 ms per entry (throughput)
267,0 MB/s throughput
#2 Total wall clock time: 117.0 ms, total callbacks time: 86411 ms, entry
size 0,034 MB -> 86,41 ms per entry (latency),0,1 ms per entry (throughput)
292,1 MB/s throughput
#3 Total wall clock time: 95.0 ms, total callbacks time: 64054 ms, entry
size 0,034 MB -> 64,05 ms per entry (latency),0,1 ms per entry (throughput)
359,8 MB/s throughput
#4 Total wall clock time: 92.0 ms, total callbacks time: 60863 ms, entry
size 0,034 MB -> 60,86 ms per entry (latency),0,1 ms per entry (throughput)
371,5 MB/s throughput
#5 Total wall clock time: 92.0 ms, total callbacks time: 62026 ms, entry
size 0,034 MB -> 62,03 ms per entry (latency),0,1 ms per entry (throughput)
371,5 MB/s throughput
#6 Total wall clock time: 93.0 ms, total callbacks time: 65033 ms, entry
size 0,034 MB -> 65,03 ms per entry (latency),0,1 ms per entry (throughput)
367,5 MB/s throughput
#7 Total wall clock time: 88.0 ms, total callbacks time: 59851 ms, entry
size 0,034 MB -> 59,85 ms per entry (latency),0,1 ms per entry (throughput)
388,4 MB/s throughput
#8 Total wall clock time: 93.0 ms, total callbacks time: 61816 ms, entry
size 0,034 MB -> 61,82 ms per entry (latency),0,1 ms per entry (throughput)
367,5 MB/s throughput
#9 Total wall clock time: 93.0 ms, total callbacks time: 70397 ms, entry
size 0,034 MB -> 70,40 ms per entry (latency),0,1 ms per entry (throughput)
367,5 MB/s throughput


Java HotSpot(TM) 64-Bit Server VM (25.121-b13) for linux-amd64 JRE
(1.8.0_121-b13), built on Dec 12 2016 16:36:53 by "java_re" with gcc 4.3.0
20080428 (Red Hat 4.3.0-8)
Memory: 4k page, physical 32870636k(8894828k free), swap
15622140k(15464700k free)
CommandLine flags: -XX:InitialHeapSize=2147483648
-XX:MaxHeapSize=2147483648 -XX:+PrintGC -XX:+PrintGCTimeStamps
-XX:+UseCompressedClassPointers -XX:+UseCompressedOops -XX:+UseG1GC
1,078: [GC pause (G1 Evacuation Pause) (young) 150M->65M(2048M), 0,0138100
secs]

With 4.5 - v3 protocol - G1 GC - throttle = 0

#0 Total wall clock time: 275.0 ms, total callbacks time: 238043 ms, entry
size 0,034 MB -> 238,04 ms per entry (latency),0,3 ms per entry
(throughput) 124,3 MB/s throughput
#1 Total wall clock time: 148.0 ms, total callbacks time: 106877 ms, entry
size 0,034 MB -> 106,88 ms per entry (latency),0,1 ms per entry
(throughput) 230,9 MB/s throughput
#2 Total wall clock time: 130.0 ms, total callbacks time: 96448 ms, entry
size 0,034 MB -> 96,45 ms per entry (latency),0,1 ms per entry (throughput)
262,9 MB/s throughput
#3 Total wall clock time: 118.0 ms, total callbacks time: 78246 ms, entry
size 0,034 MB -> 78,25 ms per entry (latency),0,1 ms per entry (throughput)
289,7 MB/s throughput
#4 Total wall clock time: 109.0 ms, total callbacks time: 79110 ms, entry
size 0,034 MB -> 79,11 ms per entry (latency),0,1 ms per entry (throughput)
313,6 MB/s throughput
#5 Total wall clock time: 124.0 ms, total callbacks time: 86652 ms, entry
size 0,034 MB -> 86,65 ms per entry (latency),0,1 ms per entry (throughput)
275,6 MB/s throughput
#6 Total wall clock time: 110.0 ms, total callbacks time: 76729 ms, entry
size 0,034 MB -> 76,73 ms per entry (latency),0,1 ms per entry (throughput)
310,7 MB/s throughput
#7 Total wall clock time: 110.0 ms, total callbacks time: 80104 ms, entry
size 0,034 MB -> 80,10 ms per entry (latency),0,1 ms per entry (throughput)
310,7 MB/s throughput
#8 Total wall clock time: 104.0 ms, total callbacks time: 73514 ms, entry
size 0,034 MB -> 73,51 ms per entry (latency),0,1 ms per entry (throughput)
328,7 MB/s throughput
#9 Total wall clock time: 102.0 ms, total callbacks time: 74487 ms, entry
size 0,034 MB -> 74,49 ms per entry (latency),0,1 ms per entry (throughput)
335,1 MB/s throughput

Java HotSpot(TM) 64-Bit Server VM (25.121-b13) for linux-amd64 JRE
(1.8.0_121-b13), built on Dec 12 2016 16:36:53 by "java_re" with gcc 4.3.0
20080428 (Red Hat 4.3.0-8)
Memory: 4k page, physical 32870636k(8958604k free), swap
15622140k(15464700k free)
CommandLine flags: -XX:InitialHeapSize=2147483648
-XX:MaxHeapSize=2147483648 -XX:+PrintGC -XX:+PrintGCTimeStamps
-XX:+UseCompressedClassPointers -XX:+UseCompressedOops -XX:+UseG1GC
0,876: [GC pause (G1 Evacuation Pause) (young) 102M->19M(2048M), 0,0107143
secs]
0,989: [GC pause (G1 Evacuation Pause) (young) 152M->81M(2048M), 0,0074561
secs]
1,125: [GC pause (G1 Evacuation Pause) (young) 198M->108M(2048M), 0,0038296
secs]
1,286: [GC pause (G1 Evacuation Pause) (young) 306M->146M(2048M), 0,0034297
secs]



2017-07-12 3:07 GMT+02:00 Sijie Guo <[email protected]>:

> On Wed, Jul 12, 2017 at 2:04 AM, Venkateswara Rao Jujjuri <
> [email protected]
> > wrote:
>
> > Enrico, let me try to paraphrase the issue.
> >
> > - With G1GC + Netty 4.1 you are getting roughly the same perf as the prev
> > release. Is that an accurate statement?
> > But you are still seeing latency spikes with Netty 4.1??
>
> > I did not fully understand your sleep use case.
> > How is sleep yielding better latency?
> >
>
> I think the "sleep(1)" in Enrico's benchmark is for throttling. It is the
> same as setting the 'throttle' value to 1000 in his test.
>
> - Sijie
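
Exactly. For reference, this is roughly what my bench does when the sleep is
enabled (simplified; TEST_DATA, lh, callback and cf are as in the code quoted
below):

    for (int i = 0; i < testsize; i++) {    // testsize = 1000
        lh.asyncAddEntry(TEST_DATA, callback, cf);
        Thread.sleep(1);  // ~1 ms pause per add: the 1000-entry burst is
                          // spread over ~1 s, so only a bounded number of
                          // adds is in flight, same effect as throttle = 1000
    }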
>
> >
> > Thanks,
> > JV
> >
> > On Tue, Jul 11, 2017 at 8:27 AM, Enrico Olivelli <[email protected]>
> > wrote:
> >
> > > Another interesting thing... during my profiling activity I gave the
> > > old v2 protocol a chance and activated the GC logs; as expected, with
> > > the v2 protocol there is almost no GC activity during the benchmark
> > > -- Enrico
> > >
> > > 2017-07-11 12:07 GMT+02:00 Enrico Olivelli <[email protected]>:
> > >
> > > >
> > > >
> > > > 2017-07-11 11:04 GMT+02:00 Sijie Guo <[email protected]>:
> > > >
> > > >> I think Netty 4 requires more offheap memory; you might need to tune
> > > >> the JVM settings. I doubt that the latency diff is coming from the
> > > >> JVM GC.
> > > >>
> > > >> A simple way to verify that is to dump the gc log by adding the
> > > >> "-Xloggc:" setting and compare the gc logs between versions.
> > > >>
> > > >
> > > >
> > > > Enabling G1 GC (-XX:+UseG1GC) apparently solves the difference
> between
> > > 4.4
> > > > and 4.5.
> > > >
> > > > And if I loop the bench (in order to have a warm JVM) the numbers
> > > > become more stable and similar to each other
> > > >
> > > > I have the "other issue" pending, the fact that latency (time for
> > > > asyncAddEntry to complete) is really bad and unpredictable (from
> > > > 100 ms to 2000 ms)
> > > >
> > > > and if I introduce a Thread.sleep(1) all the callbacks always
> > > > complete with success in < 10 ms. This happens even with G1 GC.
> > > >
> > > > Without the "sleep(1)" my machine uses a lot of CPU (I have 8 "CPUs")
> > and
> > > > with the "sleep(1)" the load is slightly lower
> > > >
> > > > Honestly for me this is still a problem and I hope that with your
> help
> > I
> > > > will be able to find the problem, wherever it is (in BK code or in
> the
> > > way
> > > > I am doing the bench)
> > > >
> > > > I will try to create a new more complete bench
> > > >
> > > > -- Enrico
> > > >
> > > >
> > > >
> > > >
> > > >>
> > > >> - Sijie
> > > >>
> > > >> On Tue, Jul 11, 2017 at 12:16 AM, Enrico Olivelli <
> > [email protected]>
> > > >> wrote:
> > > >>
> > > >> > I did a bisect and the culprit (in my opinion) for the performance
> > > >> > regression from 4.4 to 4.5 is the switch to Netty 4
> > > >> >
> > > >> > at commit:
> > > >> > commit 811ece53a1c975c4e768422f3d622ac9de6b3e41
> BOOKKEEPER-1058:
> > > >> Ignore
> > > >> > already deleted ledger on replication audit
> > > >> >
> > > >> > Total time: 204 ms
> > > >> > Total real time: 79 ms per entry
> > > >> >
> > > >> > at commit:
> > > >> > commit 74f795136c1fff3badb29fc982d0cc2d43096b45 BOOKKEEPER-1008:
> > > Netty
> > > >> 4.1
> > > >> >
> > > >> > Total time: 308 ms
> > > >> > Total real time: 189 ms per entry
> > > >> >
> > > >> > I have tried with epoll and with local transport, the results do
> > > >> > not change.
> > > >> > I tried to upgrade to netty 4.1.13 too, but no change
> > > >> >
> > > >> > Could it be the memory allocator of Netty which is overwhelmed
> > > >> > with sudden bursts of allocation?
> > > >> > I did some trials with UnpooledByteBufAllocator.DEFAULT and it
> > > >> > helps a little, we get to "110 ms per entry" vs "189 ms per entry"
> > > >> >
> > > >> > the bench is here:
> > > >> > https://github.com/eolivelli/bookkeepers-benchs/blob/master/src/test/java/BookKeeperWriteTest.java
> > > >> >
> > > >> >
> > > >> > -- Enrico
> > > >> >
> > > >> >
> > > >> >
> > > >> > 2017-07-10 19:46 GMT+02:00 Enrico Olivelli <[email protected]>:
> > > >> >
> > > >> > >
> > > >> > >
> > > >> > > On Mon, Jul 10, 2017 at 18:21, Venkateswara Rao Jujjuri <
> > > >> > > [email protected]> wrote:
> > > >> > >
> > > >> > >> With Netty changes, lack of native epoll() has a huge perf
> > > >> > >> impact as per Kishore.
> > > >> > >> Are you sure you are using epoll()?
> > > >> > >>
> > > >> > >
> > > >> > > Yes. I tried with the Netty local transport too. It does not
> > > >> > > seem related to Netty to me.
> > > >> > > I will double-check tomorrow
> > > >> > > Enrico
> > > >> > >
> > > >> > >
> > > >> > >> On Mon, Jul 10, 2017 at 1:49 AM, Enrico Olivelli <
> > > >> [email protected]>
> > > >> > >> wrote:
> > > >> > >>
> > > >> > >> > 2017-07-10 10:40 GMT+02:00 Sijie Guo <[email protected]>:
> > > >> > >> >
> > > >> > >> > > Also one other thing to check is the JVM settings. Do you
> > mind
> > > >> > sharing
> > > >> > >> > that
> > > >> > >> > > as well?
> > > >> > >> > >
> > > >> > >> > >
> > > >> > >> > This is the surefire config; I am using Oracle JDK 8:
> > > >> > >> >
> > > >> > >> >             <plugin>
> > > >> > >> >                 <artifactId>maven-surefire-plugin</artifactId>
> > > >> > >> >                 <version>2.20</version>
> > > >> > >> >                 <configuration>
> > > >> > >> >                     <forkCount>1</forkCount>
> > > >> > >> >                     <reuseForks>false</reuseForks>
> > > >> > >> >                     <forkedProcessTimeoutInSeconds>300</forkedProcessTimeoutInSeconds>
> > > >> > >> >                     <argLine>-Xmx2G -Djava.io.tmpdir=${basedir}/target</argLine>
> > > >> > >> >                 </configuration>
> > > >> > >> >             </plugin>
> > > >> > >> >
> > > >> > >> > -- Enrico
> > > >> > >> >
> > > >> > >> >
> > > >> > >> >
> > > >> > >> > > Sijie
> > > >> > >> > >
> > > >> > >> > > On Jul 10, 2017 1:17 AM, "Sijie Guo" <[email protected]>
> > > wrote:
> > > >> > >> > >
> > > >> > >> > > > I am not sure if any default values changed for the
> > > >> > >> > > > journal settings. I would suggest testing by setting the
> > > >> > >> > > > journal settings explicitly.
> > > >> > >> > > >
> > > >> > >> > > > Also if you can share your benchmark, that would be good
> > for
> > > >> other
> > > >> > >> > people
> > > >> > >> > > > to verify.
> > > >> > >> > > >
> > > >> > >> > > > Sijie
> > > >> > >> > > >
> > > >> > >> > > > On Jul 10, 2017 12:32 AM, "Enrico Olivelli" <
> > > >> [email protected]>
> > > >> > >> > wrote:
> > > >> > >> > > >
> > > >> > >> > > >> Hi,
> > > >> > >> > > >> I am doing some benchmarks on BK, I see that from 4.4.0
> to
> > > >> 4.5.0
> > > >> > >> there
> > > >> > >> > > is
> > > >> > >> > > >> something "slow" but I cannot understand what. I really
> > hope
> > > >> that
> > > >> > >> I am
> > > >> > >> > > >> wrong.
> > > >> > >> > > >>
> > > >> > >> > > >> I am working on writes; I will move to reads once writes
> > > >> > >> > > >> are OK.
> > > >> > >> > > >> My problem is both with latency (time for the AddComplete
> > > >> > >> > > >> callback to complete) and with overall throughput.
> > > >> > >> > > >>
> > > >> > >> > > >> Actually I have two distinct problems, but working on
> the
> > > >> first
> > > >> > >> > problem
> > > >> > >> > > I
> > > >> > >> > > >> found a performance regression.
> > > >> > >> > > >> I know that talking about "slow" things is a hard matter,
> > > >> > >> > > >> so I will try to describe as much as possible all the
> > > >> > >> > > >> aspects that I think are relevant.
> > > >> > >> > > >>
> > > >> > >> > > >> First problem: under certain load, performance
> > > >> > >> > > >> (latency + throughput) degrades too much.
> > > >> > >> > > >> Second problem: the first problem is more evident in 4.5.0.
> > > >> > >> > > >>
> > > >> > >> > > >> Let's describe my testcase and why I am worried.
> > > >> > >> > > >> The bench issues a batch of asyncAddEntry and prints the
> > > >> average
> > > >> > >> time
> > > >> > >> > > for
> > > >> > >> > > >> AddComplete to complete and the overall clock time.
> > > >> > >> > > >>
> > > >> > >> > > >> This is the code
> > > >> > >> > > >>
> > > >> > >> > > >> private static final byte[] TEST_DATA = new byte[35 * 1024];
> > > >> > >> > > >> private static final int testsize = 1000;
> > > >> > >> > > >>
> > > >> > >> > > >> ...... (start 1 bookie, see below)
> > > >> > >> > > >>             ClientConfiguration clientConfiguration = new ClientConfiguration();
> > > >> > >> > > >>             clientConfiguration.setZkServers(env.getAddress());
> > > >> > >> > > >>             try (BookKeeper bk = new BookKeeper(clientConfiguration);
> > > >> > >> > > >>                 LedgerHandle lh = bk.createLedger(1, 1, 1,
> > > >> > >> > > >>                         BookKeeper.DigestType.CRC32, new byte[0])) {
> > > >> > >> > > >>                 LongAdder totalTime = new LongAdder();
> > > >> > >> > > >>                 long _start = System.currentTimeMillis();
> > > >> > >> > > >>                 Collection<CompletableFuture> batch = new ConcurrentLinkedQueue<>();
> > > >> > >> > > >>                 for (int i = 0; i < testsize; i++) {
> > > >> > >> > > >>                     CompletableFuture cf = new CompletableFuture();
> > > >> > >> > > >>                     batch.add(cf);
> > > >> > >> > > >>                     lh.asyncAddEntry(TEST_DATA, new AsyncCallback.AddCallback() {
> > > >> > >> > > >>
> > > >> > >> > > >>                         long start = System.currentTimeMillis();
> > > >> > >> > > >>
> > > >> > >> > > >>                         @Override
> > > >> > >> > > >>                         public void addComplete(int rc, LedgerHandle lh, long entryId, Object ctx) {
> > > >> > >> > > >>                             long now = System.currentTimeMillis();
> > > >> > >> > > >>                             CompletableFuture _cf = (CompletableFuture) ctx;
> > > >> > >> > > >>                             if (rc == BKException.Code.OK) {
> > > >> > >> > > >>                                 _cf.complete("");
> > > >> > >> > > >>                             } else {
> > > >> > >> > > >>                                 _cf.completeExceptionally(BKException.create(rc));
> > > >> > >> > > >>                             }
> > > >> > >> > > >>                             totalTime.add(now - start);
> > > >> > >> > > >>                         }
> > > >> > >> > > >>                     }, cf);
> > > >> > >> > > >> //                    Thread.sleep(1);      // this is the trigger!!!
> > > >> > >> > > >>                 }
> > > >> > >> > > >>                 assertEquals(testsize, batch.size());
> > > >> > >> > > >>                 for (CompletableFuture f : batch) {
> > > >> > >> > > >>                     f.get();
> > > >> > >> > > >>                 }
> > > >> > >> > > >>                 long _stop = System.currentTimeMillis();
> > > >> > >> > > >>                 long delta = _stop - _start;
> > > >> > >> > > >>                 System.out.println("Total time: " + delta + " ms");
> > > >> > >> > > >>                 System.out.println("Total real time: " + totalTime.sum() + " ms -> " + (totalTime.sum() / testsize) + " ms per entry");
> > > >> > >> > > >>             }
> > > >> > >> > > >>
> > > >> > >> > > >> Bookie config:
> > > >> > >> > > >>         ServerConfiguration conf = new ServerConfiguration();
> > > >> > >> > > >>         conf.setBookiePort(5621);
> > > >> > >> > > >>         conf.setUseHostNameAsBookieID(true);
> > > >> > >> > > >>
> > > >> > >> > > >>         Path targetDir = path.resolve("bookie_data");
> > > >> > >> > > >>         conf.setZkServers("localhost:1282");
> > > >> > >> > > >>         conf.setLedgerDirNames(new String[]{targetDir.toAbsolutePath().toString()});
> > > >> > >> > > >>         conf.setJournalDirName(targetDir.toAbsolutePath().toString());
> > > >> > >> > > >>         conf.setFlushInterval(1000);
> > > >> > >> > > >>         conf.setJournalFlushWhenQueueEmpty(true);
> > > >> > >> > > >>         conf.setProperty("journalMaxGroupWaitMSec", 0);
> > > >> > >> > > >>         conf.setProperty("journalBufferedWritesThreshold", 1024);
> > > >> > >> > > >>         conf.setAutoRecoveryDaemonEnabled(false);
> > > >> > >> > > >>         conf.setEnableLocalTransport(true);
> > > >> > >> > > >>         conf.setAllowLoopback(true);
> > > >> > >> > > >>
> > > >> > >> > > >> The test starts one ZK server + 1 Bookie + the testcase
> > > >> > >> > > >> in a JUnit test
> > > >> > >> > > >>
> > > >> > >> > > >>
> > > >> > >> > > >> Results:
> > > >> > >> > > >> A - BK-4.4.0:
> > > >> > >> > > >> Total time: 209 ms
> > > >> > >> > > >> Total real time: 194337 ms -> 194 ms per entry
> > > >> > >> > > >>
> > > >> > >> > > >> B - BK-4.5.0-SNAPSHOT:
> > > >> > >> > > >> Total time: 269 ms
> > > >> > >> > > >> Total real time: 239918 ms -> 239 ms per entry
> > > >> > >> > > >>
> > > >> > >> > > >> C - BK-4.4.0 with sleep(1):
> > > >> > >> > > >> Total time: 1113 ms (1000 ms sleep time)
> > > >> > >> > > >> Total real time: 4238 ms  -> 4 ms per entry
> > > >> > >> > > >>
> > > >> > >> > > >> D - BK-4.5.0-SNAPSHOT with sleep(1):
> > > >> > >> > > >> Total time: 1121 ms (1000 ms sleep time)
> > > >> > >> > > >> Total real time: 8018 ms -> 8 ms per entry
> > > >> > >> > > >>
> > > >> > >> > > >> Problem 1 (unexpected performance degradation):
> > > >> > >> > > >> Times per entry (latency) are incredibly high in cases A
> > > >> > >> > > >> and B.
> > > >> > >> > > >> If I add a sleep(1) between one call of asyncAddEntry and
> > > >> > >> > > >> the next, "latency" is around 4 ms per entry.
> > > >> > >> > > >>
> > > >> > >> > > >> Problem 2: worse performance on 4.5.0
> > > >> > >> > > >> Compare A vs B and C vs D; it is self-explanatory.
> > > >> > >> > > >>
> > > >> > >> > > >> I am running the test on my laptop, with linux 64bit
> > > >> (Fedora), 12
> > > >> > >> GB
> > > >> > >> > > RAM,
> > > >> > >> > > >> no swap, on an SSD disk. The results are similar on
> other
> > > >> > >> computers.
> > > >> > >> > > >>
> > > >> > >> > > >> It seems that if I issue too many addEntry calls the
> > > >> > >> > > >> system slows down.
> > > >> > >> > > >>
> > > >> > >> > > >> Please note this fact:
> > > >> > >> > > >> numbers for case A and B (without sleep) mean that all
> the
> > > >> adds
> > > >> > got
> > > >> > >> > > >> completed almost together
> > > >> > >> > > >>
> > > >> > >> > > >> for the 4.5 vs 4.4 case:
> > > >> > >> > > >> I tried to disable all of the threadpool enhancements
> > > >> > >> > > >> (different read/write pools)... it makes no difference
> > > >> > >> > > >>
> > > >> > >> > > >> Questions:
> > > >> > >> > > >>
> > > >> > >> > > >> Is it the "grouping" logic of the journal?
> > > >> > >> > > >>
> > > >> > >> > > >> Is there a way of making a burst of 1000 async writes on
> > > >> > >> > > >> the same ledger perform with < 10 ms latency? (in my real
> > > >> > >> > > >> case I have bursts of concurrent writes from different
> > > >> > >> > > >> threads)
> > > >> > >> > > >>
> > > >> > >> > > >> Why is 4.5.0 slower anyway?
> > > >> > >> > > >>
> > > >> > >> > > >> Thanks
> > > >> > >> > > >>
> > > >> > >> > > >> -- Enrico
> > > >> > >> > > >>
> > > >> > >> > > >
> > > >> > >> > >
> > > >> > >> >
> > > >> > >>
> > > >> > >>
> > > >> > >>
> > > >> > >> --
> > > >> > >> Jvrao
> > > >> > >> ---
> > > >> > >> First they ignore you, then they laugh at you, then they fight
> > you,
> > > >> then
> > > >> > >> you win. - Mahatma Gandhi
> > > >> > >>
> > > >> > > --
> > > >> > >
> > > >> > >
> > > >> > > -- Enrico Olivelli
> > > >> > >
> > > >> >
> > > >>
> > > >
> > > >
> > >
> >
> >
> >
> > --
> > Jvrao
> > ---
> > First they ignore you, then they laugh at you, then they fight you, then
> > you win. - Mahatma Gandhi
> >
>
