This patch series introduces performance improvements for lzo. The previous version of this patchset is here: https://lkml.org/lkml/2018/11/21/625
This version tidies up the ifdefs as per Christoph's comment (although
certainly more could be done, this is at least a bit more consistent with
normal kernel coding style).

On 23/11/2018 2:12 am, Sergey Senozhatsky wrote:
>> The graph below shows the weighted round-trip throughput of lzo, lz4 and
>> lzo-rle, for randomly generated 4k chunks of data with varying levels of
>> entropy. (To calculate weighted round-trip throughput, compression
>> performance is emphasised to reflect the fact that zram does around 2.25x
>> more compression than decompression.
>
> Right. The number is data dependent. Not all swapped out pages can be
> compressed; compressed pages that end up being >= zs_huge_class_size() are
> considered incompressible and stored as it.
>
> I'd say that on my setups around 50-60% of pages are incompressible.

So, just to give a bit more detail: the test setup was a Samsung Chromebook
Pro, cycling through 80 tabs in Chrome. With lzo-rle, only 5% of pages
increased in size, and 90% of pages compress to 75% of original size (or
better). Mean compression ratio was 41%. Importantly for lzo-rle, there are
a lot of low-entropy pages where it can do well: in total about 20% of the
data is zeros forming part of a run of 4 or more bytes.

As a quick summary of the impact of these patches on bigger chunks of data,
I've compared the performance of four different variants of lzo on two large
(~40 MB) files. The numbers show round-trip throughput in MB/s:

Variant         | Low-entropy | High-entropy
Current lzo     |         242 |          157
Arm opts        |         290 |          159
RLE             |         876 |          151
Arm opts + RLE  |        1150 |          181

So both the Arm optimisations (8,16-byte copy & CTZ patches), and the RLE
implementation make a significant contribution to the overall performance
uplift.
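
For anyone wanting to check how much of their own data falls into the
"zeros forming part of a run of 4 or more bytes" category mentioned above,
a rough sketch follows. It is not part of the patch series and is not
necessarily how the 20% figure was obtained; the helper name
zero_run_fraction and its min_run parameter are made up for illustration.

```python
def zero_run_fraction(data: bytes, min_run: int = 4) -> float:
    """Return the fraction of bytes in `data` that are zero AND belong to
    a run of at least `min_run` consecutive zero bytes.

    Hypothetical helper for illustration only; not taken from the patches.
    """
    if not data:
        return 0.0

    counted = 0  # zero bytes that sit in a long-enough run
    run = 0      # length of the zero run currently being scanned

    for byte in data:
        if byte == 0:
            run += 1
        else:
            # The run just ended; count it only if it was long enough.
            if run >= min_run:
                counted += run
            run = 0

    # Account for a run that extends to the end of the buffer.
    if run >= min_run:
        counted += run

    return counted / len(data)


# Example: a 4k page whose first half is zeros and second half is not.
page = bytes(2048) + bytes([0xAB]) * 2048
fraction = zero_run_fraction(page)
```

Running this over a dump of swapped-out pages (however those are captured
on a given setup) would give a number comparable to the one quoted above,
assuming the same threshold of 4 bytes is used.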