Nikita, this sounds like a pretty elegant approach. Does anyone in the community see a problem with this design?
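For anyone skimming the thread, here is a tiny self-contained sketch (plain Java, hypothetical names, not actual Ignite or HANA internals) of the trick Nikita describes: values are dictionary-encoded once, only the codes are kept in memory, the query parameter is encoded with the same dictionary, so comparisons run on compressed data, and only the result is (optionally) decoded:

import java.util.HashMap;
import java.util.Map;

// Minimal sketch (not Ignite or HANA code) of dictionary compression where
// queries compare compressed codes directly.
public class DictionaryCompressionSketch {
    // Hypothetical shared dictionary: value -> compact code, and back.
    private static final Map<String, Integer> DICT = new HashMap<>();
    private static final Map<Integer, String> REVERSE = new HashMap<>();

    static int encode(String val) {
        return DICT.computeIfAbsent(val, v -> {
            int code = DICT.size();
            REVERSE.put(code, v);
            return code;
        });
    }

    static String decode(int code) {
        return REVERSE.get(code);
    }

    public static void main(String[] args) {
        // Only compressed codes are "stored" in memory.
        int[] storedCities = { encode("Moscow"), encode("London"), encode("Moscow") };

        // 1) Compress the SQL parameter before execution...
        int param = encode("Moscow");

        // 2) ...so the scan compares compressed codes, never raw values.
        int matches = 0;
        for (int city : storedCities)
            if (city == param)
                matches++;

        // Only the result is (optionally) decompressed before returning.
        System.out.println(matches + " rows match " + decode(param));
    }
}

The real design would of course have to deal with building and replicating the dictionary, but the key property is that equality checks never touch raw values.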
On Mon, Jul 25, 2016 at 4:49 PM, Nikita Ivanov <nivano...@gmail.com> wrote:

> SAP HANA does the compression by 1) compressing SQL parameters before
> execution, and 2) storing only compressed data in memory. This way all SQL
> queries work as normal with zero modifications or performance overhead.
> Only the results of a query can be (optionally) decompressed before being
> returned to the user.
>
> --
> Nikita Ivanov
>
>
> On Mon, Jul 25, 2016 at 1:40 PM, Sergey Kozlov <skoz...@gridgain.com>
> wrote:
>
> > Hi
> >
> > For approach 1: putting a large object into a partitioned cache will
> > force an update of the dictionary placed on a replicated cache. It seems
> > this may be a time-expensive operation.
> > Approaches 2-3 make sense only for rare cases, as Sergi commented.
> > Also, I see a danger of OOM if we have a high compression level and try
> > to restore the original value in memory.
> >
> > On Mon, Jul 25, 2016 at 10:39 AM, Alexey Kuznetsov <akuznet...@gridgain.com>
> > wrote:
> >
> > > Sergi,
> > >
> > > Of course it will introduce some slowdown, but with compression more
> > > data could be kept in memory and not be evicted to disk. In the case of
> > > compression by dictionary substitution it will be only one more lookup
> > > and should be fast.
> > >
> > > In general, we could provide only an API for compression out of the
> > > box, and users who really need some sort of compression will implement
> > > it themselves. This should not require much effort, I think.
> > >
> > > On Mon, Jul 25, 2016 at 2:18 PM, Sergi Vladykin <sergi.vlady...@gmail.com>
> > > wrote:
> > >
> > > > This will make sense only for rare cases when you have very large
> > > > objects stored, which can be effectively compressed. And even then it
> > > > will introduce a slowdown on all operations, which often will not be
> > > > acceptable. I guess only a few users will find this feature useful,
> > > > so I think it is not worth the effort.
> > > >
> > > > Sergi
> > > >
> > > > 2016-07-25 9:28 GMT+03:00 Alexey Kuznetsov <akuznet...@gridgain.com>:
> > > >
> > > > > Hi, All!
> > > > >
> > > > > I would like to propose one more feature for Ignite 2.0.
> > > > >
> > > > > Data compression for data in binary format.
> > > > >
> > > > > The binary format is stored as field name + field data,
> > > > > so we have a descriptor.
> > > > > How about adding one more byte to the binary data descriptor:
> > > > >
> > > > > *Compressed*:
> > > > > 0 - Data stored as is (no compression).
> > > > > 1 - Data compressed by dictionary (something like DB2 row
> > > > > compression [1], but for all binary types). We could have a system
> > > > > or user-defined replicated cache for such a dictionary and a
> > > > > *cache.compact()* method that will scan the cache, build the
> > > > > dictionary and compact the data.
> > > > > 2 - Data compressed by Java's built-in ZIP.
> > > > > 3 - Data compressed by some user custom algorithm.
> > > > >
> > > > > Of course it is possible to compress data in current Ignite 1.x,
> > > > > but in this case the compressed data cannot be accessed from the
> > > > > SQL engine. If we implement support for compression at the Ignite
> > > > > core level, the SQL engine will be able to detect that data is
> > > > > compressed and handle such data properly.
> > > > >
> > > > > What do you think?
> > > > > If the community considers this feature useful, I will create an
> > > > > issue in JIRA.
> > > > >
> > > > > [1]
> > > > > http://www.ibm.com/developerworks/data/library/techarticle/dm-1205db210compression/
> > > > >
> > > > > --
> > > > > Alexey Kuznetsov
> > > > >
> > >
> > > --
> > > Alexey Kuznetsov
> > >
> >
> > --
> > Sergey Kozlov
> > GridGain Systems
> > www.gridgain.com
> >
>
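For concreteness, below is a rough sketch (hypothetical helper names, not the actual binary protocol layout) of what the proposed one-byte "Compressed" flag in front of field data could look like, using Java's built-in Deflater/Inflater for option 2 and falling back to plain storage when compression does not pay off:

import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Sketch only: illustrates the flag values 0-3 from the proposal.
// Real field layout and dictionary/custom paths would be defined elsewhere.
public final class BinaryCompressionSketch {
    public static final byte NONE       = 0; // data stored as is
    public static final byte DICTIONARY = 1; // DB2-style dictionary substitution
    public static final byte ZIP        = 2; // java.util.zip Deflater/Inflater
    public static final byte CUSTOM     = 3; // user-provided compressor

    /** Hypothetical helper: one flag byte followed by (possibly compressed) field data. */
    public static byte[] writeField(byte[] raw) {
        Deflater deflater = new Deflater();
        deflater.setInput(raw);
        deflater.finish();

        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[512];
        while (!deflater.finished())
            out.write(buf, 0, deflater.deflate(buf));
        deflater.end();

        byte[] zipped = out.toByteArray();

        // Fall back to plain storage when compression does not pay off.
        byte flag = zipped.length < raw.length ? ZIP : NONE;
        byte[] payload = flag == ZIP ? zipped : raw;

        byte[] res = new byte[payload.length + 1];
        res[0] = flag;
        System.arraycopy(payload, 0, res, 1, payload.length);
        return res;
    }

    /** The SQL engine would branch on the flag before touching field data. */
    public static byte[] readField(byte[] stored) throws Exception {
        byte flag = stored[0];
        byte[] payload = Arrays.copyOfRange(stored, 1, stored.length);

        switch (flag) {
            case NONE:
                return payload;
            case ZIP:
                Inflater inflater = new Inflater();
                inflater.setInput(payload);
                ByteArrayOutputStream out = new ByteArrayOutputStream();
                byte[] buf = new byte[512];
                while (!inflater.finished())
                    out.write(buf, 0, inflater.inflate(buf));
                inflater.end();
                return out.toByteArray();
            default:
                throw new UnsupportedOperationException("Flag not handled in sketch: " + flag);
        }
    }

    public static void main(String[] args) throws Exception {
        byte[] raw = "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa".getBytes(StandardCharsets.UTF_8);
        byte[] stored = writeField(raw);
        System.out.println("stored " + stored.length + " bytes, flag=" + stored[0]);
        System.out.println(new String(readField(stored), StandardCharsets.UTF_8));
    }
}

The dictionary (flag 1) and custom (flag 3) paths are deliberately left out here, since they depend on how the replicated dictionary cache and user codec API end up being designed.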