Hello!
Of course, this setting will be configurable.
Regards,
--
Ilya Kasnacheev
On Wed, Sep 5, 2018 at 3:21, Dmitriy Setrakyan wrote:
> In my view, dictionary of 1024 bytes is not going to be nearly enough.
>
> On Tue, Sep 4, 2018 at 8:06 AM, Ilya Kasnacheev
> wrote:
>
> > Hello!
> >
> > In cas
In my view, dictionary of 1024 bytes is not going to be nearly enough.
On Tue, Sep 4, 2018 at 8:06 AM, Ilya Kasnacheev
wrote:
> Hello!
>
> In case of Apache Ignite, most of savings is due to BinaryObject format,
> which encodes types and fields with byte sequences. Any enum/string flags
> will a
Hello!
In the case of Apache Ignite, most of the savings are due to the BinaryObject
format, which encodes types and fields with byte sequences. Any enum/string
flags will also end up in the dictionary. And then, as it processes a record,
it fills up its individual dictionary.
But, in one cache, most if not all entrie
On Tue, Sep 4, 2018 at 2:55 AM, Ilya Kasnacheev
wrote:
> Hello!
>
> Each node has a local dictionary (per node currently, per cache planned).
> Dictionary is never shared between nodes. As data patterns shift,
> dictionary rotation is also planned.
>
> With Zstd, the best dictionary size seems to
Hello!
Each node has a local dictionary (per node currently, per cache planned).
Dictionary is never shared between nodes. As data patterns shift,
dictionary rotation is also planned.
With Zstd, the best dictionary size seems to be 1024 bytes. I imagine it is
enough to store common BinaryObject b
On Tue, Sep 4, 2018 at 1:16 AM, Ilya Kasnacheev
wrote:
> Hello!
>
> The compression is per-binary-object, but dictionary is external, shared
> between multiple (millions of) entries and stored alongside compressed
> data.
>
I was under a different impression. If the dictionary is for the whole d
Hello!
The compression is per-binary-object, but dictionary is external, shared
between multiple (millions of) entries and stored alongside compressed data.
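
A minimal sketch of the idea described above: each entry is compressed on its own, while one external dictionary is shared by all entries and kept outside the compressed bytes. This is not the code from the pull request. It swaps in the JDK's DEFLATE preset-dictionary support (java.util.zip) in place of Zstd or the custom algorithm, and the class name SharedDictionaryCodec and the sample dictionary contents are assumptions made for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

/** Hypothetical illustration: per-entry compression against one shared, external dictionary. */
final class SharedDictionaryCodec {
    private final byte[] dictionary;

    SharedDictionaryCodec(byte[] dictionary) {
        this.dictionary = dictionary.clone();
    }

    /** Compresses one entry. The dictionary itself is not written to the output. */
    byte[] compress(byte[] entry) {
        Deflater deflater = new Deflater(Deflater.BEST_SPEED);
        try {
            deflater.setDictionary(dictionary); // must be set before any input is deflated
            deflater.setInput(entry);
            deflater.finish();
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[256];
            while (!deflater.finished()) {
                int n = deflater.deflate(buf);
                out.write(buf, 0, n);
            }
            return out.toByteArray();
        } finally {
            deflater.end();
        }
    }

    /** Decompresses one entry. Fails if the shared dictionary is missing or different. */
    byte[] decompress(byte[] compressed) {
        Inflater inflater = new Inflater();
        try {
            inflater.setInput(compressed);
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[256];
            while (!inflater.finished()) {
                int n = inflater.inflate(buf);
                out.write(buf, 0, n);
                if (n == 0 && !inflater.finished()) {
                    if (inflater.needsDictionary()) {
                        // The stream header says a preset dictionary was used.
                        inflater.setDictionary(dictionary);
                    } else {
                        throw new IllegalStateException("truncated or corrupt entry");
                    }
                }
            }
            return out.toByteArray();
        } catch (DataFormatException e) {
            throw new IllegalStateException("corrupt entry", e);
        } finally {
            inflater.end();
        }
    }

    public static void main(String[] args) {
        // Assumed dictionary: byte sequences that recur across entries of one cache.
        byte[] dict = "\"firstName\":\"lastName\":\"status\":\"ACTIVE\""
            .getBytes(StandardCharsets.UTF_8);
        SharedDictionaryCodec codec = new SharedDictionaryCodec(dict);
        byte[] entry = "{\"firstName\":\"Ann\",\"lastName\":\"Lee\",\"status\":\"ACTIVE\"}"
            .getBytes(StandardCharsets.UTF_8);
        byte[] packed = codec.compress(entry);
        System.out.println("original=" + entry.length + " bytes, compressed=" + packed.length + " bytes");
        System.out.println("round trip ok: " + Arrays.equals(entry, codec.decompress(packed)));
    }
}
```

Because the dictionary lives outside each compressed entry, decompression only works while the same dictionary bytes are still available.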
Regards,
--
Ilya Kasnacheev
On Tue, Sep 4, 2018 at 2:40, Dmitriy Setrakyan wrote:
> Hi Ilya,
>
> This is very useful. Is the compression going
Hi Ilya,
This is very useful. Is the compression going to be per-page, in which case
the dictionary is going to be kept inside of a page? Or do you have some
other design in mind?
D.
On Mon, Sep 3, 2018 at 10:36 AM, Ilya Kasnacheev
wrote:
> Hello again!
>
> I've been running various compressio
Hello again!
I've been running various compression parameters through the "cod" dataset.
It looks like the best compression level in terms of speed is either 1 or 2.
The default for Zstd seems to be 3, which would almost always perform worse.
For best performance a dictionary of 1024 is optimal, for bet
Just as I started praising Zstd, it began to show JVM crashes in
native code during dictionary training :(
I guess it has a limit on the training buffer, beyond which erroneous
behaviour is exhibited. Maybe we will need to submit a pull request :)
Regards,
--
Ilya Kasnacheev
Fri, Aug 31, 2018 at 11:56, Ilya K
Hello!
I am testing Zstd with a dictionary, and it looks very promising. I'm
confident I can choose settings where it is faster than my own algorithm
while giving a better compression ratio on the "cod" dataset.
So I am happily retiring my code and switching to Zstd. Would probably mean
that we will s
Hello!
Yes, we can tinker with BinaryObject format, which is currently clearly
excessive.
But the best part of compression is that it will automatically remove this
redundancy for us, for free. Even if we had hairy XML as the binary object
format, it would still compress to roughly the same number of bytes
I have another suggestion which may help us greatly reduce object
size: implementing some kind of SQL schema.
For now, the BinaryObject format is excessive: each serialized
object stores the offset of every serialized field, even if the offset
can be easily calculated.
If we move this metadata
According to my benchmarks, the zstd compression algorithm [1] looks very
interesting: it has a high compression ratio with quite good speed.
AFAIK it supports external dictionaries, but I'm not sure about using
it with dictionaries built on the fly. Anyway, have a look at (it
has ASF 2.0 friendly
Hello Vyacheslav!
Unfortunately I have not found any efficient algorithms that will allow me
to use an external dictionary as a pre-processed data structure. If plain
gzip is used without a dictionary, the compression ratio is around 0.7, as opposed to
0.4 that I will get with custom implementation, AFAIR th
Hi Igniters!
Ilya, I'm glad to see one more person who is interested in the
compression feature in Ignite.
I looked through the pull request and want to share the following thoughts:
It's very dangerous using a custom algorithm in this way - you store
serialized data separate from a dictionary and t
>
> Currently, the dictionary for decompression is only stored on heap. After
> restart there's compressed data in the PDS, but there's no dictionary :)
Basically, it means that I've lost my data, right? How about persisting
data to disk?
Overall, we need Vladimir Ozerov to check the contributio
Hello!
It is somewhat a part of IEP-20, since I have updated it with this
particular direction.
Regards,
--
Ilya Kasnacheev
2018-08-24 2:56 GMT+03:00 Denis Magda :
> Hi Ilya,
>
> Sounds terrific! Is this part of the following Ignite enhancement proposal?
> https://cwiki.apache.org/confluence/
Hi Ilya,
Sounds terrific! Is this part of the following Ignite enhancement proposal?
https://cwiki.apache.org/confluence/display/IGNITE/IEP-20%3A+Data+Compression+in+Ignite
--
Denis
On Thu, Aug 23, 2018 at 5:17 AM Ilya Kasnacheev
wrote:
> Hello!
>
> My plan was to add a compression section to
Hello!
My plan was to add a compression section to cache configuration, where you
can enable compression, enable key compression (which has heavier
performance implications), adjust dictionary gathering settings, and in the
future possibly choose between algorithms. In fact I'm not sure, since my
a
Ok, thanks. IMO we need to store the dictionary in Durable memory before
merging into master.
On Thu, Aug 23, 2018 at 15:12, Ilya Kasnacheev wrote:
> Hello!
>
> Currently, the dictionary for decompression is only stored on heap. After
> restart there's compressed data in the PDS, but there's no dictiona
Hi Ilya
Is there a plan to introduce it as an option of the Ignite configuration? In
that case, instead of a boolean type I suggest using an enum, to reserve the
ability to add more compression algorithms in the future
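
A rough sketch of what such a configuration section might look like, with the algorithm selected by an enum rather than a boolean. Everything here is hypothetical: the class CacheCompressionSettings, its property names, and the enum constants are invented for illustration and are not part of the Apache Ignite API. The default dictionary size of 1024 bytes is just the value mentioned elsewhere in this thread.

```java
import java.util.Objects;

/** Hypothetical configuration holder; these names are NOT part of the Apache Ignite API. */
final class CacheCompressionSettings {
    /** An enum instead of a boolean leaves room for more algorithms later. */
    enum Algorithm { NONE, CUSTOM_DICTIONARY, ZSTD }

    private Algorithm algorithm = Algorithm.NONE; // compression is off by default
    private boolean compressKeys = false;         // key compression costs more, so it is opt-in
    private int dictionarySizeBytes = 1024;       // assumed default, taken from the thread

    CacheCompressionSettings setAlgorithm(Algorithm algorithm) {
        this.algorithm = Objects.requireNonNull(algorithm, "algorithm");
        return this;
    }

    CacheCompressionSettings setCompressKeys(boolean compressKeys) {
        this.compressKeys = compressKeys;
        return this;
    }

    CacheCompressionSettings setDictionarySizeBytes(int dictionarySizeBytes) {
        if (dictionarySizeBytes <= 0) {
            throw new IllegalArgumentException("dictionarySizeBytes must be positive");
        }
        this.dictionarySizeBytes = dictionarySizeBytes;
        return this;
    }

    Algorithm getAlgorithm() { return algorithm; }
    boolean isCompressKeys() { return compressKeys; }
    int getDictionarySizeBytes() { return dictionarySizeBytes; }

    /** True when any algorithm other than NONE is selected. */
    boolean isEnabled() { return algorithm != Algorithm.NONE; }

    public static void main(String[] args) {
        CacheCompressionSettings settings = new CacheCompressionSettings()
            .setAlgorithm(Algorithm.ZSTD)
            .setDictionarySizeBytes(4096);
        System.out.println("enabled=" + settings.isEnabled()
            + ", keys=" + settings.isCompressKeys()
            + ", dictionary=" + settings.getDictionarySizeBytes() + " bytes");
    }
}
```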
On Thu, Aug 23, 2018 at 1:09 PM, Ilya Kasnacheev
wrote:
> Hello!
>
> I want to share with
Hello!
Currently, the dictionary for decompression is only stored on heap. After
restart there's compressed data in the PDS, but there's no dictionary :)
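
To make the restart problem concrete, here is a minimal sketch of keeping the dictionary on disk so that it can be reloaded after a restart. It is an assumption-laden illustration only: the class DictionaryStore, the file naming scheme, and the use of plain files (rather than Ignite's durable memory or PDS) are all invented for this example.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

/** Hypothetical file-backed dictionary store; not how Ignite persists anything. */
final class DictionaryStore {
    private final Path dir;

    DictionaryStore(Path dir) {
        this.dir = dir;
    }

    /** Demo helper: creates a scratch directory and hides the checked exception. */
    static Path newTempDirectory() {
        try {
            return Files.createTempDirectory("dictionaries");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Writes one dictionary version to disk before any entry is compressed with it. */
    void save(int version, byte[] dictionary) {
        try {
            Files.write(fileFor(version), dictionary);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Reads a dictionary version back, for example after a node restart. */
    byte[] load(int version) {
        try {
            return Files.readAllBytes(fileFor(version));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    private Path fileFor(int version) {
        return dir.resolve("dictionary-" + version + ".bin");
    }

    public static void main(String[] args) {
        Path dir = newTempDirectory();
        byte[] dictionary = {10, 20, 30, 40};
        new DictionaryStore(dir).save(1, dictionary);

        // A fresh instance over the same directory stands in for the restarted node.
        byte[] reloaded = new DictionaryStore(dir).load(1);
        System.out.println("dictionary survived: " + Arrays.equals(dictionary, reloaded));
    }
}
```

Keying the files by a version number is one possible way to leave room for the dictionary rotation mentioned in this thread, since older entries would still need the dictionary they were compressed with.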
Regards,
--
Ilya Kasnacheev
2018-08-23 14:58 GMT+03:00 Dmitriy Pavlov :
> Hi Ilya,
>
> Thank you for sharing this here. I believe this cont
Hi Ilya,
Thank you for sharing this here. I believe this contribution will be
accepted by the Community. Moreover, it shows such a remarkable performance
boost.
I'm pretty sure this patch will be reviewed by Ignite Native Persistence
experts soon.
What do you mean by "can't survive PDS node restart"?