I have tested the performance of the newly released Go 1.7beta1 against a set of compression algorithms I have been working on. Here are the results for compressing the Silesia corpus. The tests were performed on a desktop i7-2600 @ 3.40GHz, Windows 7, 16GB RAM.
In the encoding and decoding time columns, the first number is Go 1.6 and the second is Go 1.7beta1.

                        Size         Ratio    Enc. (sec)     Dec. (sec)
LZ4                     101,631,119  47.95%     2.5 /   2.3    2.5 /   2.3
Snappy                  101,285,612  47.79%     2.4 /   2.3    2.6 /   2.4
BWT+RANK+ZRLT+Huffman    52,352,711  24.70%    38.1 /  32.5   23.0 /  22.5
BWT+RANK+ZRLT+Range      52,061,295  24.56%    38.5 /  32.9   24.6 /  23.7
BWT+RANK+ZRLT+ANS        52,061,115  24.56%    39.2 /  33.8   23.0 /  22.4
BWT+RANK+ZRLT+FPAQ       49,584,922  23.40%    49.0 /  41.4   36.4 /  34.0
BWT+CM                   46,505,288  21.94%    91.2 /  71.9   81.8 /  65.8
BWT+PAQ                  46,514,028  21.95%   148.5 / 121.1  140.1 / 117.2
TPAQ                     42,463,928  20.04%   335.3 / 264.2  329.9 / 262.7

The speed improvements are consistent and rather impressive. An explanation of the algorithms and more performance numbers are available here: https://github.com/flanglet/kanzi/wiki/Compression-examples.

As a data point, here are the Java results:

                        Size         Ratio   Enc. (sec)   Dec. (sec)
LZ4                     101,631,119  47.95%      3.8          2.5
Snappy                  101,285,612  47.79%      3.6          2.5
BWT+RANK+ZRLT+Huffman    52,352,711  24.70%     33.4         21.9
BWT+RANK+ZRLT+Range      52,061,295  24.56%     33.3         23.7
BWT+RANK+ZRLT+ANS        52,061,115  24.56%     33.8         21.8
BWT+RANK+ZRLT+FPAQ       49,584,922  23.40%     37.2         28.3
BWT+CM                   46,505,288  21.94%     53.0         45.9
BWT+PAQ                  46,514,028  21.95%     93.0         86.5
TPAQ                     42,463,928  20.04%    173.0        182.3

With the progress in release 1.7beta1, Go catches up with Java for the fast compressors (the Java numbers for LZ4 and Snappy are not that useful because a good percentage of the time is spent in JVM warmup). Go still lags behind for the context-mixing based compressors (which require several function calls to estimate the probability of each bit).
I found one oddity when running tests on the Zero Run Length Transform (ZRLT):

Go 1.6
ZRLT encoding [ms]: 10678   Throughput [MB/s]: 223
ZRLT decoding [ms]:  7577   Throughput [MB/s]: 314
ZRLT encoding [ms]: 10720   Throughput [MB/s]: 222
ZRLT decoding [ms]:  7573   Throughput [MB/s]: 314
ZRLT encoding [ms]: 10651   Throughput [MB/s]: 223
ZRLT decoding [ms]:  7509   Throughput [MB/s]: 317

Go 1.7beta1
ZRLT encoding [ms]:  7049   Throughput [MB/s]: 338
ZRLT decoding [ms]: 11573   Throughput [MB/s]: 206
ZRLT encoding [ms]:  6910   Throughput [MB/s]: 345
ZRLT decoding [ms]: 12040   Throughput [MB/s]: 198
ZRLT encoding [ms]:  7024   Throughput [MB/s]: 339
ZRLT decoding [ms]: 11894   Throughput [MB/s]: 200

Decoding now takes much longer than encoding. The culprit is a tight loop that decodes the run length (val is a byte):

for val < 2 {
    runLength = (runLength << 1) | int(val)
    srcIdx++
    [...]

I changed the loop condition like this (it feels a bit kludgy):

for val&1 == val {
    runLength = (runLength << 1) | int(val)
    srcIdx++
    [...]

Go 1.7beta1, with the modified loop condition:
ZRLT encoding [ms]: 6842   Throughput [MB/s]: 348
ZRLT decoding [ms]: 7704   Throughput [MB/s]: 309
ZRLT encoding [ms]: 6851   Throughput [MB/s]: 347
ZRLT decoding [ms]: 7822   Throughput [MB/s]: 304
ZRLT encoding [ms]: 6823   Throughput [MB/s]: 349
ZRLT decoding [ms]: 7721   Throughput [MB/s]: 308