Vyacheslav, Anton,

Are there any ideas and/or prototypes for the API? Your design suggestions seem to make sense, but I would like to see how all this will look from a user's standpoint.
-Val

On Wed, Jun 7, 2017 at 1:06 AM, Антон Чураев <churaev...@gmail.com> wrote:

> Vyacheslav, correct me if something is wrong.
>
> We could let users choose between CPU usage and memory/network usage
> by compressing some attributes of stored objects.
> You have studied the design, and it is possible to localize the changes in
> marshalling without affecting performance or current functionality.
>
> I think that it's useful for our project and users.
> Community, what do you think about this proposal?
>
> 2017-06-06 17:29 GMT+03:00 Vyacheslav Daradur <daradu...@gmail.com>:
>
> > In short,
> >
> > During marshalling, a field is represented as a BinaryFieldAccessor, which
> > manages its marshalling. It checks whether the field is marked with the
> > @BinaryCompression annotation; if so, the binary representation of the
> > field (a byte array) is compressed. It is marked as compressed by a type
> > constant (GridBinaryMarshaller.COMPRESSED), and the compressed byte array
> > is then included in the binary representation of the whole object. Note
> > that the header of the marshalled object is not compressed. Compression
> > affects only the representation of the object's fields.
> >
> > Objects in IgniteCache are represented as BinaryObject, which is a wrapper
> > over the byte array of the marshalled object.
> > BinaryObject provides some useful methods, which are used by Ignite
> > systems.
> > For example, queries use the BinaryObject#field method, which deserializes
> > only one field of an object, without deserializing the whole object.
> > If BinaryObject#field encounters the compressed type constant during
> > deserialization, it decompresses the byte array and then continues
> > unmarshalling as usual.
> >
> > I have now introduced the Compressor interface in IgniteConfiguration; it
> > allows users to plug in their own compressor implementation, which is a
> > requirement of the task [1].
> >
> > As far as I know, Vladimir Ozerov doesn't like the idea of giving users
> > this option.
> > In that case we can choose a compression algorithm to provide by default
> > and move the interface into the internals of the binary infrastructure.
> > For this case I've prepared the benchmarks that I sent earlier.
> >
> > I vote for the ZSTD algorithm [2]; it provides a good compression ratio and
> > good throughput. It has implementations in Java, .NET and C++ and an
> > ASF-friendly license, so we can use it on all Ignite platforms.
> > You can see an assessment of this algorithm in my benchmarks.
> >
> > [1] https://issues.apache.org/jira/browse/IGNITE-3592
> > [2] https://github.com/facebook/zstd
> >
> > 2017-06-06 16:02 GMT+03:00 Антон Чураев <churaev...@gmail.com>:
> >
> > > Looks good to me.
> > >
> > > Could you describe the implementation design in a couple of sentences,
> > > so that we can estimate the completeness and complexity of the proposal?
> > >
> > > 2017-06-06 15:26 GMT+03:00 Vyacheslav Daradur <daradu...@gmail.com>:
> > >
> > > > Anton,
> > > >
> > > > Of course, the solution does not affect the existing implementation. I
> > > > mean, nothing changes (including performance) if the user does not use
> > > > the @BinaryCompression annotation.
> > > > Only if the user decides to compress a specific field or fields of a
> > > > class will compression be applied to the annotated fields during
> > > > marshalling.
> > > >
> > > > 2017-06-06 15:10 GMT+03:00 Антон Чураев <churaev...@gmail.com>:
> > > >
> > > > > Vyacheslav,
> > > > >
> > > > > Is it possible to propose an implementation that can be switched on
> > > > > on demand?
> > > > > In that case it should not affect the performance of the current
> > > > > solution.
> > > > >
> > > > > I mean that users should decide what is more important for them:
> > > > > throughput or memory/network usage.
> > > > > They may choose to compress only some objects, or only some
> > > > > attributes of objects.
> > > > >
> > > > > 2017-06-06 14:48 GMT+03:00 Vyacheslav Daradur <daradu...@gmail.com>:
> > > > >
> > > > > > Conclusion:
> > > > > > The provided solution reduces the size of an object in IgniteCache
> > > > > > at the cost of reduced throughput (a small reduction in some cases);
> > > > > > the trade-off depends on which part of the object is compressed and
> > > > > > on the compression algorithm.
> > > > > > I mean, we can use memory more effectively, and in some cases it can
> > > > > > reduce the load on the interconnect (replication, rebalancing).
> > > > > >
> > > > > > It will be particularly useful for object fields that contain large
> > > > > > text (>~ 250 bytes) and compress well.
> > > > > >
> > > > > > 2017-06-06 12:00 GMT+03:00 Антон Чураев <churaev...@gmail.com>:
> > > > > >
> > > > > > > Vyacheslav, thank you! But could you please provide conclusions or
> > > > > > > proposals based on these benchmarks?
> > > > > > >
> > > > > > > 2017-06-06 11:28 GMT+03:00 Vyacheslav Daradur <daradu...@gmail.com>:
> > > > > > >
> > > > > > > > Dmitriy,
> > > > > > > >
> > > > > > > > Excel pages:
> > > > > > > >
> > > > > > > > 1) "Compression ratio (2)" shows object size with and without
> > > > > > > > compression. (Conditions: literal text)
> > > > > > > > The 1st graph shows the compression ratios of different
> > > > > > > > compression algorithms depending on the size of the compressed
> > > > > > > > field.
> > > > > > > > The 2nd graph shows the size of objects depending on size and
> > > > > > > > compression algorithm.
> > > > > > > >
> > > > > > > > 2) "Compression ratio (1)" shows object size with and without
> > > > > > > > compression.
> > > > > > > > (Conditions: poorly compressible character sequence)
> > > > > > > > The 1st graph shows the compression ratios of different
> > > > > > > > compression algorithms depending on the size of the compressed
> > > > > > > > field.
> > > > > > > > The 2nd graph shows the size of objects depending on size and
> > > > > > > > compression algorithm.
> > > > > > > >
> > > > > > > > 3) "put-avg" shows the average time of the "put" operation
> > > > > > > > depending on size and compression algorithm.
> > > > > > > >
> > > > > > > > 4) "put-thrpt" shows the throughput of the "put" operation
> > > > > > > > depending on size and compression algorithm.
> > > > > > > >
> > > > > > > > 5) "get-avg" shows the average time of the "get" operation
> > > > > > > > depending on size and compression algorithm.
> > > > > > > >
> > > > > > > > 6) "get-thrpt" shows the throughput of the "get" operation
> > > > > > > > depending on size and compression algorithm.
> > > > > > > >
> > > > > > > > 2017-06-06 10:59 GMT+03:00 Dmitriy Setrakyan <dsetrak...@apache.org>:
> > > > > > > >
> > > > > > > > > Vladimir, I am not sure how to interpret the graphs. What are
> > > > > > > > > we looking at?
> > > > > > > > >
> > > > > > > > > On Tue, Jun 6, 2017 at 12:33 AM, Vyacheslav Daradur
> > > > > > > > > <daradu...@gmail.com> wrote:
> > > > > > > > >
> > > > > > > > > > Hi, Igniters.
> > > > > > > > > >
> > > > > > > > > > I've prepared some benchmarks. Results: [1].
> > > > > > > > > >
> > > > > > > > > > I've also prepared an evaluation in the form of diagrams [2].
> > > > > > > > > >
> > > > > > > > > > I hope this helps to interest the community and speeds up a
> > > > > > > > > > reaction to this improvement :)
> > > > > > > > > >
> > > > > > > > > > [1] https://github.com/daradurvs/ignite-compression/tree/master/src/main/resources/result
> > > > > > > > > > [2] https://drive.google.com/file/d/0B2CeUAOgrHkoMklyZ25YTEdKcEk/view
> > > > > > > > > >
> > > > > > > > > > 2017-05-24 9:49 GMT+03:00 Vyacheslav Daradur <daradu...@gmail.com>:
> > > > > > > > > >
> > > > > > > > > > > Guys, any thoughts?
> > > > > > > > > > >
> > > > > > > > > > > 2017-05-16 13:40 GMT+03:00 Vyacheslav Daradur <daradu...@gmail.com>:
> > > > > > > > > > >
> > > > > > > > > > >> Hi guys,
> > > > > > > > > > >>
> > > > > > > > > > >> I've prepared a PR to show my idea.
> > > > > > > > > > >> https://github.com/apache/ignite/pull/1951/files
> > > > > > > > > > >>
> > > > > > > > > > >> About querying: I've just copied the existing tests and
> > > > > > > > > > >> annotated the test data.
> > > > > > > > > > >> https://github.com/apache/ignite/pull/1951/files#diff-c19a9df4058141d059bb577e75244764
> > > > > > > > > > >>
> > > > > > > > > > >> It means that fields marked with @BinaryCompression will be
> > > > > > > > > > >> compressed at marshalling via BinaryMarshaller.
> > > > > > > > > > >>
> > > > > > > > > > >> This solution has no effect on existing data or the project
> > > > > > > > > > >> architecture.
> > > > > > > > > > >>
> > > > > > > > > > >> I'll be glad to hear your thoughts.
> > > > > > > > > > >>
> > > > > > > > > > >> 2017-05-15 19:18 GMT+03:00 Vyacheslav Daradur <daradu...@gmail.com>:
> > > > > > > > > > >>
> > > > > > > > > > >>> Dmitriy,
> > > > > > > > > > >>>
> > > > > > > > > > >>> I have a ready prototype. I want to show it.
> > > > > > > > > > >>> It is always easier to discuss a concrete example.
> > > > > > > > > > >>>
> > > > > > > > > > >>> 2017-05-15 19:02 GMT+03:00 Dmitriy Setrakyan <dsetrak...@apache.org>:
> > > > > > > > > > >>>
> > > > > > > > > > >>>> Vyacheslav,
> > > > > > > > > > >>>>
> > > > > > > > > > >>>> I think it is a bit premature to provide a PR without
> > > > > > > > > > >>>> getting community consensus on the dev list. Please allow
> > > > > > > > > > >>>> some time for the community to respond.
> > > > > > > > > > >>>>
> > > > > > > > > > >>>> D.
> > > > > > > > > > >>>>
> > > > > > > > > > >>>> On Mon, May 15, 2017 at 6:36 AM, Vyacheslav Daradur
> > > > > > > > > > >>>> <daradu...@gmail.com> wrote:
> > > > > > > > > > >>>>
> > > > > > > > > > >>>> > I created the ticket:
> > > > > > > > > > >>>> > https://issues.apache.org/jira/browse/IGNITE-5226
> > > > > > > > > > >>>> >
> > > > > > > > > > >>>> > I'll prepare a PR with the described solution in a couple
> > > > > > > > > > >>>> > of days.
> > > > > > > > > > >>>> >
> > > > > > > > > > >>>> > 2017-05-15 15:05 GMT+03:00 Vyacheslav Daradur <daradu...@gmail.com>:
> > > > > > > > > > >>>> >
> > > > > > > > > > >>>> > > Hi, Igniters!
> > > > > > > > > > >>>> > >
> > > > > > > > > > >>>> > > Apache Ignite 2.0 is released.
> > > > > > > > > > >>>> > >
> > > > > > > > > > >>>> > > Let's continue the discussion about a compression design.
> > > > > > > > > > >>>> > >
> > > > > > > > > > >>>> > > At the moment, I have found only one solution that is
> > > > > > > > > > >>>> > > compatible with querying and indexing: per-object-field
> > > > > > > > > > >>>> > > compression.
> > > > > > > > > > >>>> > > Per-field compression means that the metadata (header) of
> > > > > > > > > > >>>> > > an object won't be compressed; only the serialized values
> > > > > > > > > > >>>> > > of the object's fields (in byte-array form) will be
> > > > > > > > > > >>>> > > compressed.
> > > > > > > > > > >>>> > >
> > > > > > > > > > >>>> > > This solution has some contentious issues:
> > > > > > > > > > >>>> > > - small values, such as primitives and short arrays: there
> > > > > > > > > > >>>> > >   is no sense in compressing them;
> > > > > > > > > > >>>> > > - it is not possible to use compression with Java
> > > > > > > > > > >>>> > >   predefined types.
> > > > > > > > > > >>>> > >
> > > > > > > > > > >>>> > > We can provide an annotation, for example
> > > > > > > > > > >>>> > > @IgniteCompression, which users can apply to mark fields
> > > > > > > > > > >>>> > > for compression.
> > > > > > > > > > >>>> > >
> > > > > > > > > > >>>> > > Any thoughts?
> > > > > > > > > > >>>> > >
> > > > > > > > > > >>>> > > Maybe someone already has a ready design?
> > > > > > > > > > >>>> > >
> > > > > > > > > > >>>> > > 2017-04-10 11:06 GMT+03:00 Vyacheslav Daradur <daradu...@gmail.com>:
> > > > > > > > > > >>>> > >
> > > > > > > > > > >>>> > >> Alexey,
> > > > > > > > > > >>>> > >>
> > > > > > > > > > >>>> > >> Yes, I've read it.
> > > > > > > > > > >>>> > >>
> > > > > > > > > > >>>> > >> OK, let's discuss the public API design.
> > > > > > > > > > >>>> > >>
> > > > > > > > > > >>>> > >> I think we need to add a configuration entity to
> > > > > > > > > > >>>> > >> CacheConfiguration that will contain the Compressor
> > > > > > > > > > >>>> > >> interface implementation and some useful parameters.
> > > > > > > > > > >>>> > >> Or maybe provide a BinaryMarshaller decorator that will
> > > > > > > > > > >>>> > >> compress data after marshalling.
> > > > > > > > > > >>>> > >>
> > > > > > > > > > >>>> > >> 2017-04-10 10:40 GMT+03:00 Alexey Kuznetsov <akuznet...@apache.org>:
> > > > > > > > > > >>>> > >>
> > > > > > > > > > >>>> > >>> Vyacheslav,
> > > > > > > > > > >>>> > >>>
> > > > > > > > > > >>>> > >>> Did you read the initial discussion [1] about
> > > > > > > > > > >>>> > >>> compression?
> > > > > > > > > > >>>> > >>> As far as I remember, we agreed to add only some
> > > > > > > > > > >>>> > >>> "top-level" API in order to provide a way for Ignite
> > > > > > > > > > >>>> > >>> users to inject some sort of custom compression.
> > > > > > > > > > >>>> > >>>
> > > > > > > > > > >>>> > >>> [1] http://apache-ignite-developers.2346864.n4.nabble.com/Data-compression-in-Ignite-2-0-td10099.html
> > > > > > > > > > >>>> > >>>
> > > > > > > > > > >>>> > >>> On Mon, Apr 10, 2017 at 2:19 PM, daradurvs
> > > > > > > > > > >>>> > >>> <daradu...@gmail.com> wrote:
> > > > > > > > > > >>>> > >>>
> > > > > > > > > > >>>> > >>> > Hi Igniters!
> > > > > > > > > > >>>> > >>> >
> > > > > > > > > > >>>> > >>> > I am interested in this task:
> > > > > > > > > > >>>> > >>> > Provide some kind of pluggable compression SPI support
> > > > > > > > > > >>>> > >>> > <https://issues.apache.org/jira/browse/IGNITE-3592>
> > > > > > > > > > >>>> > >>> >
> > > > > > > > > > >>>> > >>> > I developed a solution at the BinaryMarshaller level,
> > > > > > > > > > >>>> > >>> > but the reviewer rejected it.
> > > > > > > > > > >>>> > >>> >
> > > > > > > > > > >>>> > >>> > Let's continue the discussion of the task's goals and
> > > > > > > > > > >>>> > >>> > solution design.
> > > > > > > > > > >>>> > >>> > As I understand it, the main goal of this task is to
> > > > > > > > > > >>>> > >>> > store data in compressed form.
> > > > > > > > > > >>>> > >>> > This is what I need from Ignite as its user.
> > > > > > > > > > >>>> > >>> > Compression saves on servers: we can store more data on
> > > > > > > > > > >>>> > >>> > the same servers at the cost of increased CPU
> > > > > > > > > > >>>> > >>> > utilization.
> > > > > > > > > > >>>> > >>> >
> > > > > > > > > > >>>> > >>> > I'm researching the possibility of implementing
> > > > > > > > > > >>>> > >>> > compression at the cache level.
> > > > > > > > > > >>>> > >>> >
> > > > > > > > > > >>>> > >>> > Any thoughts?
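
To make the question about the user's view of the API more concrete, here is a minimal sketch of how the per-field design discussed in this thread might look. The names @BinaryCompression and Compressor come from the thread, but their shapes here are assumptions; DeflateCompressor, Article, FieldCompressionSketch and the PLAIN/COMPRESSED marker bytes are hypothetical, and DEFLATE from java.util.zip stands in for ZSTD so that the example needs no external library. It is not the code from the PR and does not use Ignite's marshaller.

```java
import java.io.ByteArrayOutputStream;
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

/** Toy field-level marshaller: one marker byte followed by the payload. */
final class FieldCompressionSketch {
    static final byte PLAIN = 0;
    static final byte COMPRESSED = 1; // plays the role of GridBinaryMarshaller.COMPRESSED

    /** Serializes one String field, compressing it only if it is annotated. */
    static byte[] writeField(Object obj, String fieldName, Compressor compressor) {
        try {
            Field field = obj.getClass().getDeclaredField(fieldName);
            field.setAccessible(true);
            byte[] raw = ((String) field.get(obj)).getBytes(StandardCharsets.UTF_8);
            boolean compress = field.isAnnotationPresent(BinaryCompression.class);
            byte[] payload = compress ? compressor.compress(raw) : raw;
            byte[] out = new byte[payload.length + 1];
            out[0] = compress ? COMPRESSED : PLAIN;
            System.arraycopy(payload, 0, out, 1, payload.length);
            return out;
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException("Cannot read field " + fieldName, e);
        }
    }

    /** Reads one field back, decompressing only when the marker says so. */
    static String readField(byte[] bytes, Compressor compressor) {
        byte[] payload = Arrays.copyOfRange(bytes, 1, bytes.length);
        byte[] raw = bytes[0] == COMPRESSED ? compressor.decompress(payload) : payload;
        return new String(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        Article article = new Article();
        article.title = "Compression";
        StringBuilder text = new StringBuilder();
        for (int i = 0; i < 100; i++)
            text.append("Ignite stores objects in binary form. ");
        article.body = text.toString();

        Compressor compressor = new DeflateCompressor();
        byte[] body = writeField(article, "body", compressor);
        System.out.println("raw chars: " + article.body.length() + ", stored bytes: " + body.length);
        System.out.println("round trip ok: " + readField(body, compressor).equals(article.body));
    }
}

/** Marker annotation; the name is from the thread, its shape is assumed. */
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.FIELD)
@interface BinaryCompression {}

/** Hypothetical pluggable compressor; the method names are assumptions. */
interface Compressor {
    byte[] compress(byte[] bytes);
    byte[] decompress(byte[] bytes);
}

/** Stand-in built on java.util.zip (DEFLATE), not ZSTD. */
final class DeflateCompressor implements Compressor {
    @Override public byte[] compress(byte[] bytes) {
        Deflater deflater = new Deflater();
        deflater.setInput(bytes);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[1024];
        while (!deflater.finished())
            out.write(buf, 0, deflater.deflate(buf));
        deflater.end();
        return out.toByteArray();
    }

    @Override public byte[] decompress(byte[] bytes) {
        Inflater inflater = new Inflater();
        inflater.setInput(bytes);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[1024];
        try {
            while (!inflater.finished()) {
                int n = inflater.inflate(buf);
                if (n == 0 && (inflater.needsInput() || inflater.needsDictionary()))
                    throw new IllegalStateException("Truncated compressed field");
                out.write(buf, 0, n);
            }
            return out.toByteArray();
        } catch (DataFormatException e) {
            throw new IllegalStateException("Corrupted compressed field", e);
        } finally {
            inflater.end();
        }
    }
}

/** Example user class: only the large text field opts in to compression. */
final class Article {
    String title;
    @BinaryCompression String body;
}
```

In this sketch the user's only touch point is the annotation on Article.body; unannotated fields take the unchanged path, which mirrors the claim in the thread that nothing changes for users who do not opt in. Reading a single field checks the marker byte first, in the same spirit as the BinaryObject#field behaviour described above.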