Hi Vyacheslav,

Yes, I would suggest you do so.

On Fri, Aug 25, 2017 at 2:51 PM, Vyacheslav Daradur <daradu...@gmail.com>
wrote:

> Hi, should I close the initial ticket [1] as "Won't Fix" and add a link to
> the new discussion about storage compression [2] in the comments?
>
> [1] https://issues.apache.org/jira/browse/IGNITE-3592
> [2] http://apache-ignite-developers.2346864.n4.nabble.com/Data-compression-in-Ignite-td20679.html
>
> 2017-08-09 23:05 GMT+03:00 Vyacheslav Daradur <daradu...@gmail.com>:
>
>> Vladimir, thank you for the detailed explanation.
>>
>> I think I've understood the main idea of the described storage compression.
>>
>> I'll join the new discussion after researching the given material and
>> completing the varint optimization [1].
>>
>> [1] https://issues.apache.org/jira/browse/IGNITE-5097
>>
>> 2017-08-02 15:43 GMT+03:00 Alexey Kuznetsov <akuznet...@apache.org>:
>>
>>> Vova,
>>>
>>> Finally we are back to my initial idea - to look at how the "big databases"
>>> compress data :)
>>>
>>>
>>> Just a reminder of how IBM DB2 does this [1].
>>>
>>> [1] http://www.ibm.com/developerworks/data/library/techarticle/dm-1205db210compression/
>>>
>>> On Tue, Aug 1, 2017 at 4:15 PM, Vladimir Ozerov <voze...@gridgain.com>
>>> wrote:
>>>
>>> > Vyacheslav,
>>> >
>>> > This is not about my needs, but about the product :-) BinaryObject is a
>>> > central entity used for both data transfer and data storage. This is
>>> > both good and bad at the same time.
>>> >
>>> > The good thing is that as we optimize the binary protocol, we improve
>>> > both network and storage performance at the same time. We have at least
>>> > 3 things which will be included in the product soon: varint encoding
>>> > [1], optimized string encoding [2] and null-field optimization [3]. The
>>> > bad thing is that the binary object format is not well suited for data
>>> > storage optimizations, including compression. For example, one good
>>> > compression technique is to organize data in a column-store format, or
>>> > to introduce a shared "dictionary" with unique values at the cache
>>> > level. In both cases N equal values are not stored N times. Instead, we
>>> > store one value and N references to it, or so. This way 2x-10x
>>> > compression is possible depending on the workload type. A binary object
>>> > protocol with some compression on top of it cannot give such an
>>> > improvement, because it will compress data in individual objects instead
>>> > of compressing the whole cache data in a single context.
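>>> >
>>> > To illustrate the cache-level "dictionary" idea, here is a minimal
>>> > sketch in plain Java (the class and its methods are made up for this
>>> > example, it is not an Ignite API): each unique value is kept once and
>>> > the rows keep only small integer references to it.
>>> >
>>> > import java.util.ArrayList;
>>> > import java.util.HashMap;
>>> > import java.util.List;
>>> > import java.util.Map;
>>> >
>>> > /** Hypothetical cache-level dictionary: N equal values are stored once,
>>> >  *  entries hold only an int id pointing into the dictionary. */
>>> > public class ValueDictionary {
>>> >     private final Map<String, Integer> valueToId = new HashMap<>();
>>> >     private final List<String> idToValue = new ArrayList<>();
>>> >
>>> >     /** Returns the id of the value, registering it on first use. */
>>> >     public int encode(String value) {
>>> >         Integer id = valueToId.get(value);
>>> >         if (id == null) {
>>> >             id = idToValue.size();
>>> >             valueToId.put(value, id);
>>> >             idToValue.add(value);
>>> >         }
>>> >         return id;
>>> >     }
>>> >
>>> >     /** Resolves an id back to the original value. */
>>> >     public String decode(int id) {
>>> >         return idToValue.get(id);
>>> >     }
>>> >
>>> >     public static void main(String[] args) {
>>> >         ValueDictionary dict = new ValueDictionary();
>>> >         String[] column = {"Moscow", "Moscow", "London", "Moscow"};
>>> >         int[] encoded = new int[column.length];
>>> >         for (int i = 0; i < column.length; i++)
>>> >             encoded[i] = dict.encode(column[i]);
>>> >         // 4 strings are now 2 dictionary entries + 4 small ints.
>>> >         System.out.println(dict.decode(encoded[3])); // prints "Moscow"
>>> >     }
>>> > }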
>>> >
>>> > That said, I propose to give up adding compression to BinaryObject.
>>> > This is a dead end. Instead, we should:
>>> > 1) Optimize the protocol itself to be more compact, as described in the
>>> > aforementioned Ignite tickets (see the varint sketch below)
>>> > 2) Start a new discussion about storage compression
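>>> >
>>> > For (1), the varint encoding of [1] in its simplest form works roughly
>>> > as follows (just a sketch of the general technique, not the actual
>>> > patch): small numbers take 1-2 bytes instead of a fixed 4 or 8.
>>> >
>>> > import java.io.ByteArrayOutputStream;
>>> >
>>> > /** Simplest unsigned varint codec: 7 data bits per byte, the high bit
>>> >  *  flags that another byte follows. */
>>> > public class Varint {
>>> >     public static byte[] write(long value) {
>>> >         ByteArrayOutputStream out = new ByteArrayOutputStream();
>>> >         while ((value & ~0x7FL) != 0) {
>>> >             out.write((int) ((value & 0x7F) | 0x80));
>>> >             value >>>= 7;
>>> >         }
>>> >         out.write((int) value);
>>> >         return out.toByteArray();
>>> >     }
>>> >
>>> >     /** Reads a value back starting at the given offset. */
>>> >     public static long read(byte[] buf, int off) {
>>> >         long value = 0;
>>> >         int shift = 0;
>>> >         byte b;
>>> >         do {
>>> >             b = buf[off++];
>>> >             value |= (long) (b & 0x7F) << shift;
>>> >             shift += 7;
>>> >         } while ((b & 0x80) != 0);
>>> >         return value;
>>> >     }
>>> >
>>> >     public static void main(String[] args) {
>>> >         byte[] bytes = write(300);
>>> >         // 300 fits into 2 bytes instead of a 4-byte int.
>>> >         System.out.println(bytes.length + " byte(s): " + read(bytes, 0));
>>> >     }
>>> > }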
>>> >
>>> > You can read papers of other vendors to get a better understanding of
>>> > possible compression options. E.g. Oracle has a lot of compression
>>> > techniques, including heat maps, background compression, per-block
>>> > compression, data dictionaries, etc. [4].
>>> >
>>> > [1] https://issues.apache.org/jira/browse/IGNITE-5097
>>> > [2] https://issues.apache.org/jira/browse/IGNITE-5655
>>> > [3] https://issues.apache.org/jira/browse/IGNITE-3939
>>> > [4] http://www.oracle.com/technetwork/database/options/compression/advanced-compression-wp-12c-1896128.pdf
>>> >
>>> > Vladimir.
>>> >
>>> >
>>>
>>> --
>>> Alexey Kuznetsov
>>>
>>
>>
>>
>> --
>> Best Regards, Vyacheslav D.
>>
>
>
>
> --
> Best Regards, Vyacheslav D.
>
