Vova,

Finally, we are back to my initial idea: to look at how "big databases"
compress data :)


Just a reminder of how IBM DB2 does this [1].

[1] http://www.ibm.com/developerworks/data/library/techarticle/dm-1205db210compression/

On Tue, Aug 1, 2017 at 4:15 PM, Vladimir Ozerov <voze...@gridgain.com>
wrote:

> Vyacheslav,
>
> This is not about my needs, but about the product :-) BinaryObject is a
> central entity used for both data transfer and data storage. This is both
> good and bad at the same time.
>
> The good thing is that as we optimize the binary protocol, we improve both
> network and storage performance at the same time. At least three things
> will be included in the product soon: varint encoding [1], optimized
> string encoding [2] and null-field optimization [3]. The bad thing is that
> the binary object format is not well suited to data storage optimizations,
> including compression. For example, one good compression technique is to
> organize data in a column-store format, or to introduce a shared
> "dictionary" of unique values at the cache level. In both cases N equal
> values are not stored N times. Instead, we store one value and N
> references to it. This way 2x-10x compression is possible, depending on
> the workload type. The binary object protocol with some compression on top
> of it cannot give such an improvement, because it compresses data within
> individual objects instead of compressing the whole cache data in a
> single context.
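
To illustrate the shared "dictionary" idea from the paragraph above, here is
a minimal sketch in Python. It is not Ignite code: the function names and the
sample "city" field are hypothetical, and a real implementation would also
have to deal with persistence, concurrency and dictionary growth.

```python
def dictionary_encode(values):
    """Store each distinct value once and replace every occurrence
    with an integer reference into the shared dictionary."""
    dictionary = []   # distinct values, in order of first appearance
    index_of = {}     # value -> position in `dictionary`
    references = []   # one small integer per original value
    for value in values:
        idx = index_of.get(value)
        if idx is None:
            idx = len(dictionary)
            index_of[value] = idx
            dictionary.append(value)
        references.append(idx)
    return dictionary, references


def dictionary_decode(dictionary, references):
    """Rebuild the original sequence from the dictionary and references."""
    return [dictionary[i] for i in references]


# Hypothetical "city" field taken from six cache entries.
cities = ["Moscow", "London", "Moscow", "Moscow", "London", "Paris"]
dictionary, references = dictionary_encode(cities)
restored = dictionary_decode(dictionary, references)
```

Six strings become three dictionary entries plus six small integers. The
saving comes from sharing one dictionary across all entries, which is
exactly what compressing each object in isolation cannot do.
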
>
> That said, I propose that we give up on adding compression to
> BinaryObject. This is a dead end. Instead, we should:
> 1) Optimize the protocol itself to be more compact, as described in the
> aforementioned Ignite tickets
> 2) Start a new discussion about storage compression
>
> You can read papers from other vendors to get a better understanding of
> the possible compression options. E.g. Oracle has a lot of compression
> techniques, including heat maps, background compression, per-block
> compression, data dictionaries, etc. [4].
>
> [1] https://issues.apache.org/jira/browse/IGNITE-5097
> [2] https://issues.apache.org/jira/browse/IGNITE-5655
> [3] https://issues.apache.org/jira/browse/IGNITE-3939
> [4] http://www.oracle.com/technetwork/database/options/compression/advanced-compression-wp-12c-1896128.pdf
>
> Vladimir.
>
>

-- 
Alexey Kuznetsov
