Sergey Kozlov wrote:
>> For approach 1: putting a large object into a partitioned cache will force an update of the dictionary placed in the replicated cache. It may be a time-expensive operation.

The dictionary will be built only once, and we could control what should be put into it: for example, we could check minimum and maximum value sizes and decide whether a value goes into the dictionary or not (a minimal sketch of such a check is below).
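To illustrate the size check, here is a minimal sketch. All names in it (DictionaryPolicy, shouldAddToDictionary, the thresholds) are hypothetical, for illustration only, not existing Ignite API:

/**
 * Hypothetical policy deciding whether a serialized value is worth an entry
 * in the shared dictionary kept in a replicated cache. Illustrative only.
 */
public class DictionaryPolicy {
    private final int minSize; // Below this, dictionary overhead outweighs the savings.
    private final int maxSize; // Above this, the entry would bloat the replicated cache.

    public DictionaryPolicy(int minSize, int maxSize) {
        this.minSize = minSize;
        this.maxSize = maxSize;
    }

    /** Returns true if the serialized value should be substituted via the dictionary. */
    public boolean shouldAddToDictionary(byte[] serializedVal) {
        return serializedVal.length >= minSize && serializedVal.length <= maxSize;
    }
}

With such a policy the expensive replicated-cache update is paid only for values that actually benefit from substitution.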
>> Approaches 2-3 make sense only for rare cases, as Sergi commented.

But it is better to at least have the possibility to plug in user code for compression than not to have it at all.

>> Also I see a danger of OOM if we've got a high compression ratio and try to restore the original value in memory.

We can easily get OOM from many other operations right now, without compression, so I do not think this is an issue. We could add a NOTE to the documentation about this possibility.

Andrey Kornev wrote:
>> ... in general I think compression is a great idea. The cleanest way to achieve that would be to just make it possible to chain the marshallers...

I think that is also a good idea, and it looks like it could be used for compression with some sort of ZIP algorithm (a rough sketch of such a chained marshaller is in the P.S. below). But how do we deal with compression by dictionary substitution? We need to build the dictionary first. Any ideas?

Nikita Ivanov wrote:
>> SAP Hana does the compression by 1) compressing SQL parameters before execution...

Looks interesting, but my initial point was about compression of cache data, not SQL queries. My idea was to make compression transparent to the SQL engine when it looks up data. The idea of compressing SQL query results does look very interesting, though, because it is a known fact that the SQL engine can consume quite a lot of heap storing result sets. I think this should be discussed in a separate thread.

Just for your information: in my first message I mentioned that DB2 has dictionary-based compression and, according to them, it is possible to compress typical data by 50-80%. I have some experience with DB2 and can confirm this.

--
Alexey Kuznetsov
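P.S. To make the marshaller-chaining idea more concrete, here is a minimal sketch of a compressing marshaller that wraps another marshaller and GZIPs its output. The SimpleMarshaller interface and both classes are stand-ins for illustration, not the actual Ignite Marshaller API:

import java.io.*;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

/** Stand-in marshaller abstraction, for illustration only. */
interface SimpleMarshaller {
    byte[] marshal(Object obj) throws IOException;
    Object unmarshal(byte[] bytes) throws IOException, ClassNotFoundException;
}

/** Plain JDK serialization, used as the inner link of the chain. */
class JdkMarshaller implements SimpleMarshaller {
    @Override public byte[] marshal(Object obj) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bos)) {
            oos.writeObject(obj);
        }
        return bos.toByteArray();
    }

    @Override public Object unmarshal(byte[] bytes) throws IOException, ClassNotFoundException {
        try (ObjectInputStream ois = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return ois.readObject();
        }
    }
}

/** Decorator that GZIP-compresses whatever the wrapped marshaller produces. */
class CompressingMarshaller implements SimpleMarshaller {
    private final SimpleMarshaller delegate;

    CompressingMarshaller(SimpleMarshaller delegate) {
        this.delegate = delegate;
    }

    @Override public byte[] marshal(Object obj) throws IOException {
        byte[] raw = delegate.marshal(obj);
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (GZIPOutputStream gzip = new GZIPOutputStream(bos)) {
            gzip.write(raw);
        }
        return bos.toByteArray();
    }

    @Override public Object unmarshal(byte[] bytes) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (GZIPInputStream gzip = new GZIPInputStream(new ByteArrayInputStream(bytes))) {
            byte[] buf = new byte[8192];
            int n;
            while ((n = gzip.read(buf)) != -1)
                bos.write(buf, 0, n);
        }
        return delegate.unmarshal(bos.toByteArray());
    }
}

Chaining is then just: new CompressingMarshaller(new JdkMarshaller()). A dictionary-substitution marshaller could be plugged in as another decorator in the same chain, which is exactly where the dictionary-building question above comes in.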