Hi Vyacheslav,

Yes, I would suggest you do so.
On Fri, Aug 25, 2017 at 2:51 PM, Vyacheslav Daradur <daradu...@gmail.com> wrote:
> Hi, should I close the initial ticket [1] as "Won't Fix" and add a link to
> the new discussion about storage compression [2] in the comments?
>
> [1] https://issues.apache.org/jira/browse/IGNITE-3592
> [2] http://apache-ignite-developers.2346864.n4.nabble.com/Data-compression-in-Ignite-td20679.html
>
> 2017-08-09 23:05 GMT+03:00 Vyacheslav Daradur <daradu...@gmail.com>:
>
>> Vladimir, thank you for the detailed explanation.
>>
>> I think I've understood the main idea of the described storage compression.
>>
>> I'll join the new discussion after researching the given material and
>> completing the varint optimization [1].
>>
>> [1] https://issues.apache.org/jira/browse/IGNITE-5097
>>
>> 2017-08-02 15:43 GMT+03:00 Alexey Kuznetsov <akuznet...@apache.org>:
>>
>>> Vova,
>>>
>>> Finally we are back to my initial idea - to look at how the "big databases"
>>> compress data :)
>>>
>>> Just to remind how IBM DB2 does this [1].
>>>
>>> [1] http://www.ibm.com/developerworks/data/library/techarticle/dm-1205db210compression/
>>>
>>> On Tue, Aug 1, 2017 at 4:15 PM, Vladimir Ozerov <voze...@gridgain.com>
>>> wrote:
>>>
>>> > Vyacheslav,
>>> >
>>> > This is not about my needs, but about the product :-) BinaryObject is a
>>> > central entity used for both data transfer and data storage. This is both
>>> > good and bad at the same time.
>>> >
>>> > The good thing is that as we optimize the binary protocol, we improve both
>>> > network and storage performance at the same time. We have at least 3 things
>>> > which will be included into the product soon: varint encoding [1], optimized
>>> > string encoding [2] and null-field optimization [3]. The bad thing is that
>>> > the binary object format is not well suited for data storage optimizations,
>>> > including compression. For example, one good compression technique is to
>>> > organize data in column-store format, or to introduce a shared "dictionary"
>>> > with unique values on the cache level. In both cases N equal values are not
>>> > stored N times. Instead, we store one value and N references to it, or so.
>>> > This way 2x-10x compression is possible depending on the workload type. The
>>> > binary object protocol with some compression on top of it cannot give such
>>> > an improvement, because it will compress data in individual objects, instead
>>> > of compressing the whole cache data in a single context.
>>> >
>>> > That said, I propose to give up on adding compression to BinaryObject. This is
>>> > a dead end. Instead, we should:
>>> > 1) Optimize the protocol itself to be more compact, as described in the
>>> > aforementioned Ignite tickets
>>> > 2) Start a new discussion about storage compression
>>> >
>>> > You can read papers of other vendors to get a better understanding of
>>> > possible compression options. E.g. Oracle has a lot of compression
>>> > techniques, including heat maps, background compression, per-block
>>> > compression, data dictionaries, etc. [4].
>>> >
>>> > [1] https://issues.apache.org/jira/browse/IGNITE-5097
>>> > [2] https://issues.apache.org/jira/browse/IGNITE-5655
>>> > [3] https://issues.apache.org/jira/browse/IGNITE-3939
>>> > [4] http://www.oracle.com/technetwork/database/options/compression/advanced-compression-wp-12c-1896128.pdf
>>> >
>>> > Vladimir.
>>> >
>>>
>>> --
>>> Alexey Kuznetsov
>>
>>
>> --
>> Best Regards, Vyacheslav D.
>
>
> --
> Best Regards, Vyacheslav D.
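
As a side note for readers of this thread: the varint encoding tracked in IGNITE-5097 is essentially the usual "7 bits per byte, high bit means more bytes follow" scheme, so small integers take 1-3 bytes instead of a fixed 4. Below is a minimal Java sketch of that general idea only; it is not the actual Ignite implementation, and the class and method names are illustrative.

import java.io.ByteArrayOutputStream;

/** Minimal varint sketch: small values take fewer bytes on the wire. */
public class VarintSketch {
    /** Writes an int using 7 bits per byte; the high bit marks "more bytes follow". */
    static void writeVarInt(ByteArrayOutputStream out, int val) {
        while ((val & ~0x7F) != 0) {
            out.write((val & 0x7F) | 0x80);
            val >>>= 7;
        }
        out.write(val);
    }

    public static void main(String[] args) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeVarInt(out, 5);          // 1 byte instead of 4
        writeVarInt(out, 300);        // 2 bytes instead of 4
        writeVarInt(out, 1_000_000);  // 3 bytes instead of 4
        System.out.println("Encoded size: " + out.size() + " bytes"); // 6 vs. 12 for fixed-width ints
    }
}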
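
Likewise, the cache-level "dictionary" Vladimir mentions boils down to keeping each distinct value once and storing only small integer references in the rows, which is where the 2x-10x gain on repetitive data comes from. A rough standalone Java illustration, assuming a hypothetical DictionarySketch class (not an Ignite API):

import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/**
 * Rough illustration of dictionary encoding: each distinct value is stored once,
 * rows keep only small integer codes referencing it. Not an Ignite API.
 */
public class DictionarySketch {
    private final Map<String, Integer> codes = new HashMap<>();
    private final List<String> values = new ArrayList<>();

    /** Returns the code for a value, adding it to the dictionary on first use. */
    int encode(String val) {
        return codes.computeIfAbsent(val, v -> {
            values.add(v);
            return values.size() - 1;
        });
    }

    /** Resolves a code back to the original value. */
    String decode(int code) {
        return values.get(code);
    }

    public static void main(String[] args) {
        DictionarySketch dict = new DictionarySketch();
        // A million rows with the same country value would store the string once
        // and a million small codes, instead of a million copies of the string.
        int code1 = dict.encode("RUSSIA");
        int code2 = dict.encode("RUSSIA");
        System.out.println(code1 == code2); // true: one stored value, N references
    }
}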