Data compression in Ignite 2.0

Re: Data compression in Ignite 2.0

daradurvs
Anton,

>> I thought that if compressed data is stored in memory, it will also be
>> transmitted over the wire in compressed form. Is that right?

In the per-field compression case, yes.
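
To illustrate why the answer is yes in the per-field case: the compressed bytes of a field are part of the marshalled object itself, so whatever is stored is also what gets sent. Below is a minimal Java sketch under stated assumptions; it is not Ignite code. The byte layout, the marker values PLAIN and COMPRESSED, and the class name are hypothetical, and Deflater from java.util.zip stands in for whichever algorithm would actually be chosen.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

/** Sketch only: the layout and marker values are hypothetical, not Ignite's binary format. */
class PerFieldCompressionSketch {
    static final byte PLAIN = 0;       // hypothetical marker: raw field bytes follow
    static final byte COMPRESSED = 1;  // hypothetical stand-in for a "compressed" type constant

    /** Marshals one string field as [4-byte header][marker][4-byte length][payload]. */
    static byte[] marshal(int typeId, String field, boolean compress) {
        byte[] raw = field.getBytes(StandardCharsets.UTF_8);
        byte[] payload = compress ? deflate(raw) : raw;
        return ByteBuffer.allocate(4 + 1 + 4 + payload.length)
            .putInt(typeId)                        // the header is never compressed
            .put(compress ? COMPRESSED : PLAIN)
            .putInt(payload.length)
            .put(payload)
            .array();
    }

    /** Reads the field back, inflating it only if the marker says it was compressed. */
    static String readField(byte[] marshalled) {
        ByteBuffer buf = ByteBuffer.wrap(marshalled);
        buf.getInt();                              // skip the header
        byte marker = buf.get();
        byte[] payload = new byte[buf.getInt()];
        buf.get(payload);
        byte[] raw = marker == COMPRESSED ? inflate(payload) : payload;
        return new String(raw, StandardCharsets.UTF_8);
    }

    static byte[] deflate(byte[] data) {
        Deflater deflater = new Deflater();
        deflater.setInput(data);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] chunk = new byte[1024];
        while (!deflater.finished())
            out.write(chunk, 0, deflater.deflate(chunk));
        deflater.end();
        return out.toByteArray();
    }

    static byte[] inflate(byte[] data) {
        Inflater inflater = new Inflater();
        inflater.setInput(data);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] chunk = new byte[1024];
        try {
            while (!inflater.finished()) {
                int n = inflater.inflate(chunk);
                if (n == 0 && inflater.needsInput())
                    throw new IllegalStateException("Truncated compressed field");
                out.write(chunk, 0, n);
            }
        } catch (DataFormatException e) {
            throw new IllegalStateException("Corrupt compressed field", e);
        } finally {
            inflater.end();
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        StringBuilder text = new StringBuilder();
        for (int i = 0; i < 100; i++)
            text.append("compressible text ");
        // The same byte array would be kept in memory and written to the wire.
        System.out.println("plain bytes:      " + marshal(42, text.toString(), false).length);
        System.out.println("compressed bytes: " + marshal(42, text.toString(), true).length);
    }
}
```

In this sketch no separate wire-level step is involved: a node that receives the array can forward or store it unchanged, and only a reader of the field pays the decompression cost.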

2017-06-08 13:36 GMT+03:00 Антон Чураев <[hidden email]>:

> Guys, could you please help me.
> I thought that if compressed data is stored in memory, it will also be
> transmitted over the wire in compressed form. Is that right?
>
> 2017-06-08 13:30 GMT+03:00 Vyacheslav Daradur <[hidden email]>:
>
> > Vladimir,
> >
> > The main problem I am trying to solve is storing data in memory in
> > compressed form via Ignite.
> > The main goal is to use memory more effectively.
> >
> > >> here the much simpler step would be to do full
> > >> compression on a per-cache basis rather than dealing with the per-field case.
> >
> > Please explain your idea. Compress data by memory page?
> > Is it compatible with querying and indexing?
> >
> > >> In the end, if a user would like to compress a particular field, he can
> > >> always do it on his own
> > I don't think we should reason this way: if a user needs something, he tries to
> > choose a tool that has this feature OOTB.
> >
> >
> >
> > 2017-06-08 12:53 GMT+03:00 Vladimir Ozerov <[hidden email]>:
> >
> > > Igniters,
> > >
> > > Honestly, I still do not see how to apply this feature gracefully to
> > > Ignite. And the overall approach of compressing only particular fields looks
> > > overcomplicated to me. Remember that our main use case is an application
> > > without classes on the server. It means that any kind of annotation is
> > > inapplicable. To be more precise: a proper API should be implemented to
> > > handle the no-class case (e.g. how would one build such an object through
> > > BinaryBuilder without a class?), and only then should annotations be added
> > > as a convenient addition to the more basic API.
> > >
> > > It seems to me that a full implementation, which takes into account a proper
> > > "classless" API, changes to binary metadata to reflect compressed fields,
> > > changes to SQL, changes to the binary protocol, and porting to .NET and CPP,
> > > will yield a very complex solution with little value to the product.
> > >
> > > Instead, as I proposed earlier, it seems that we'd better start with the
> > > problem we are trying to solve. Basically, compression could help in two
> > > cases:
> > > 1) Transmitting data over the wire - it should be implemented on the
> > > communication layer and should not affect the binary serialization component a lot.
> > > 2) Storing data in memory - here the much simpler step would be to do full
> > > compression on a per-cache basis rather than dealing with the per-field case.
> > >
> > > In the end, if a user would like to compress a particular field, he can always
> > > do it on his own, and set the already compressed field to our BinaryObject.
> > >
> > > Vladimir.
> > >
> > >
> > > On Thu, Jun 8, 2017 at 12:37 PM, Vyacheslav Daradur <[hidden email]>
> > > wrote:
> > >
> > > > Valentin,
> > > >
> > > > Yes, I have a prototype [1][2].
> > > >
> > > > You can see an example of the Java class [3] that I used in my benchmark.
> > > > For example:
> > > > class Foo {
> > > >     @BinaryCompression
> > > >     String data;
> > > > }
> > > > If a user decides to store the object in compressed form, he can use
> > > > the annotation @BinaryCompression as shown above.
> > > > It means the annotated field 'data' will be compressed at marshalling.
> > > >
> > > > [1] https://github.com/apache/ignite/pull/1951
> > > > [2] https://issues.apache.org/jira/browse/IGNITE-5226
> > > > [3] https://github.com/daradurvs/ignite-compression/blob/master/src/main/java/ru/daradurvs/ignite/compression/model/Audit1F.java
> > > >
> > > >
> > > >
> > > > 2017-06-08 2:04 GMT+03:00 Valentin Kulichenko <[hidden email]>:
> > > >
> > > > > Vyacheslav, Anton,
> > > > >
> > > > > Are there any ideas and/or prototypes for the API? Your design suggestions
> > > > > seem to make sense, but I would like to see how all this will look from
> > > > > the user's standpoint.
> > > > >
> > > > > -Val
> > > > >
> > > > > On Wed, Jun 7, 2017 at 1:06 AM, Антон Чураев <[hidden email]> wrote:
> > > > >
> > > > > > Vyacheslav, correct me if something is wrong.
> > > > > >
> > > > > > We could give users the opportunity to choose between CPU usage and
> > > > > > MEM/NET usage by compressing some attributes of stored objects.
> > > > > > You have studied the design, and it is possible to localize the changes in
> > > > > > marshalling without affecting performance or current functionality.
> > > > > >
> > > > > > I think that it's useful for our project and users.
> > > > > > Community, what do you think about this proposal?
> > > > > >
> > > > > >
> > > > > > 2017-06-06 17:29 GMT+03:00 Vyacheslav Daradur <[hidden email]>:
> > > > > >
> > > > > > > In short,
> > > > > > >
> > > > > > > During marshalling, a field is represented as a BinaryFieldAccessor, which
> > > > > > > manages its marshalling. It checks whether the field is marked with the
> > > > > > > annotation @BinaryCompression; in that case the binary representation of the
> > > > > > > field (a byte array) will be compressed. It will be marked as compressed by a
> > > > > > > type constant (GridBinaryMarshaller.COMPRESSED), and after this the compressed
> > > > > > > byte array will be included in the binary representation of the whole object.
> > > > > > > Note that the header of the marshalled object will not be compressed.
> > > > > > > Compression affects only the representation of the object's fields.
> > > > > > >
> > > > > > > Objects in IgniteCache are represented as BinaryObject, which is a wrapper
> > > > > > > over the byte array of the marshalled object.
> > > > > > > BinaryObject provides some useful methods, which are used by Ignite
> > > > > > > systems.
> > > > > > > For example, queries use the BinaryObject#field method, which deserializes
> > > > > > > only one field of the object, without deserializing the whole object.
> > > > > > > During deserialization, if the BinaryObject#field method meets the constant
> > > > > > > of the compressed type, it decompresses this byte array, then continues
> > > > > > > unmarshalling as usual.
> > > > > > >
> > > > > > > Now, I have introduced the Compressor interface in IgniteConfigurations; it
> > > > > > > allows a user to use his own implementation of a compressor. This is the
> > > > > > > requirement in the task [1].
> > > > > > >
> > > > > > > As far as I know, Vladimir Ozerov doesn't like the idea of granting this
> > > > > > > opportunity to the user.
> > > > > > > In that case we can choose a compression algorithm which we will provide by
> > > > > > > default and move the interface to the internals of the binary infrastructure.
> > > > > > > For this case I've prepared the benchmarks which I sent earlier.
> > > > > > >
> > > > > > > I vote for the ZSTD algorithm [2]; it provides a good compression ratio and
> > > > > > > good throughput. It has implementations in Java, .NET and C++, and has an
> > > > > > > ASF-friendly license, so we can use it on all the Ignite platforms.
> > > > > > > You can look at an assessment of this algorithm in my benchmarks.
> > > > > > >
> > > > > > > [1] https://issues.apache.org/jira/browse/IGNITE-3592
> > > > > > > [2] https://github.com/facebook/zstd
> > > > > > >
> > > > > > >
> > > > > > > 2017-06-06 16:02 GMT+03:00 Антон Чураев <[hidden email]>:
> > > > > > >
> > > > > > > > Looks good to me.
> > > > > > > >
> > > > > > > > Could you propose a design of the implementation in a couple of sentences,
> > > > > > > > so that we can estimate the completeness and complexity of the proposal?
> > > > > > > >
> > > > > > > > 2017-06-06 15:26 GMT+03:00 Vyacheslav Daradur <[hidden email]>:
> > > > > > > >
> > > > > > > > > Anton,
> > > > > > > > >
> > > > > > > > > Of course, the solution does not affect the existing implementation. I mean,
> > > > > > > > > nothing changes (including performance) if the user does not use the
> > > > > > > > > @BinaryCompression annotation.
> > > > > > > > > Only if the user decides to use compression on a specific field or fields
> > > > > > > > > of a class will compression be applied at marshalling, and only to the
> > > > > > > > > annotated fields.
> > > > > > > > >
> > > > > > > > > 2017-06-06 15:10 GMT+03:00 Антон Чураев <[hidden email]>:
> > > > > > > > >
> > > > > > > > > > Vyacheslav,
> > > > > > > > > >
> > > > > > > > > > Is it possible to propose an implementation that can be switched on
> > > > > > > > > > on demand?
> > > > > > > > > > In this case it should not affect the performance of the current solution.
> > > > > > > > > >
> > > > > > > > > > I mean that users should decide what is more important for them:
> > > > > > > > > > throughput or memory/net usage.
> > > > > > > > > > Maybe they will choose not all objects, or only some attributes of
> > > > > > > > > > objects, for compression.
> > > > > > > > > >
> > > > > > > > > > 2017-06-06 14:48 GMT+03:00 Vyacheslav Daradur <[hidden email]>:
> > > > > > > > > >
> > > > > > > > > > > Conclusion:
> > > > > > > > > > > The provided solution allows reducing the size of an object in IgniteCache
> > > > > > > > > > > at the cost of a throughput reduction (small in some cases); it depends on
> > > > > > > > > > > the part of the object which is compressed and on the compression algorithm.
> > > > > > > > > > > I mean, we can make more effective use of memory, and in some cases it can
> > > > > > > > > > > reduce the load on the interconnect (replication, rebalancing).
> > > > > > > > > > >
> > > > > > > > > > > It will be particularly useful for object fields which are
> > > > > > > > > > > large text (>~ 250 bytes) and can be effectively compressed.
> > > > > > > > > > >
> > > > > > > > > > > 2017-06-06 12:00 GMT+03:00 Антон Чураев <[hidden email]>:
> > > > > > > > > > >
> > > > > > > > > > > > Vyacheslav, thank you! But could you please provide conclusions or
> > > > > > > > > > > > proposals based on these benchmarks?
> > > > > > > > > > > >
> > > > > > > > > > > > 2017-06-06 11:28 GMT+03:00 Vyacheslav Daradur <[hidden email]>:
> > > > > > > > > > > >
> > > > > > > > > > > > > Dmitriy,
> > > > > > > > > > > > >
> > > > > > > > > > > > > Excel pages:
> > > > > > > > > > > > >
> > > > > > > > > > > > > 1) "Compression ratio (2)" - shows object size with and
> > > > > > > > > > > > > without compression. (Conditions: literal text)
> > > > > > > > > > > > > The 1st graph shows the compression ratios of different compression
> > > > > > > > > > > > > algorithms depending on the size of the compressed field.
> > > > > > > > > > > > > The 2nd graph shows the size of objects depending on field sizes and
> > > > > > > > > > > > > compression algorithms.
> > > > > > > > > > > > >
> > > > > > > > > > > > > 2) "Compression ratio (1)" - shows object size with and
> > > > > > > > > > > > > without compression. (Conditions: badly compressible character sequence)
> > > > > > > > > > > > > The 1st graph shows the compression ratios of different compression
> > > > > > > > > > > > > algorithms depending on the size of the compressed field.
> > > > > > > > > > > > > The 2nd graph shows the size of objects depending on field sizes and
> > > > > > > > > > > > > compression algorithms.
> > > > > > > > > > > > >
> > > > > > > > > > > > > 3) "put-avg" - shows the average time of the "put" operation depending on
> > > > > > > > > > > > > size and compression algorithm.
> > > > > > > > > > > > >
> > > > > > > > > > > > > 4) "put-thrpt" - shows the throughput of the "put" operation depending on
> > > > > > > > > > > > > size and compression algorithm.
> > > > > > > > > > > > >
> > > > > > > > > > > > > 5) "get-avg" - shows the average time of the "get" operation depending on
> > > > > > > > > > > > > size and compression algorithm.
> > > > > > > > > > > > >
> > > > > > > > > > > > > 6) "get-thrpt" - shows the throughput of the "get" operation depending on
> > > > > > > > > > > > > size and compression algorithm.
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > 2017-06-06 10:59 GMT+03:00 Dmitriy Setrakyan <[hidden email]>:
> > > > > > > > > > > > >
> > > > > > > > > > > > > > Vladimir, I am not sure how to interpret the graphs. What are we
> > > > > > > > > > > > > > looking at?
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > On Tue, Jun 6, 2017 at 12:33 AM, Vyacheslav Daradur <[hidden email]>
> > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Hi, Igniters.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > I've prepared some benchmarks. Results: [1].
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > And I've prepared an evaluation in the form of diagrams [2].
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > I hope this helps to interest the community and speeds up a reaction to
> > > > > > > > > > > > > > > this improvement :)
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > [1] https://github.com/daradurvs/ignite-compression/tree/master/src/main/resources/result
> > > > > > > > > > > > > > > [2] https://drive.google.com/file/d/0B2CeUAOgrHkoMklyZ25YTEdKcEk/view
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > 2017-05-24 9:49 GMT+03:00 Vyacheslav Daradur <[hidden email]>:
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Guys, any thoughts?
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > 2017-05-16 13:40 GMT+03:00 Vyacheslav Daradur <[hidden email]>:
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > >> Hi guys,
> > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > >> I've prepared the PR to show my idea.
> > > > > > > > > > > > > > > > >> https://github.com/apache/ignite/pull/1951/files
> > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > >> About querying - I've just copied the existing tests and annotated
> > > > > > > > > > > > > > > > >> the test data.
> > > > > > > > > > > > > > > > >> https://github.com/apache/ignite/pull/1951/files#diff-c19a9df4058141d059bb577e75244764
> > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > >> It means fields marked with @BinaryCompression will be
> > > > > > > > > > > > > > > > >> compressed at marshalling via BinaryMarshaller.
> > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > >> This solution has no effect on existing data or the project architecture.
> > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > >> I'll be glad to hear your thoughts.
> > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > >> 2017-05-15 19:18 GMT+03:00 Vyacheslav Daradur <[hidden email]>:
> > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > >>> Dmitriy,
> > > > > > > > > > > > > > > > >>>
> > > > > > > > > > > > > > > > >>> I have a ready prototype. I want to show it.
> > > > > > > > > > > > > > > > >>> It is always easier to discuss with an example.
> > > > > > > > > > > > > > > >>>
> > > > > > > > > > > > > > > > >>> 2017-05-15 19:02 GMT+03:00 Dmitriy Setrakyan <[hidden email]>:
> > > > > > > > > > > > > > > > >>>
> > > > > > > > > > > > > > > > >>>> Vyacheslav,
> > > > > > > > > > > > > > > > >>>>
> > > > > > > > > > > > > > > > >>>> I think it is a bit premature to provide a PR without getting
> > > > > > > > > > > > > > > > >>>> community consensus on the dev list. Please allow some time for the
> > > > > > > > > > > > > > > > >>>> community to respond.
> > > > > > > > > > > > > > > > >>>>
> > > > > > > > > > > > > > > > >>>> D.
> > > > > > > > > > > > > > > >>>>
> > > > > > > > > > > > > > > > >>>> On Mon, May 15, 2017 at 6:36 AM, Vyacheslav Daradur <[hidden email]>
> > > > > > > > > > > > > > > > >>>> wrote:
> > > > > > > > > > > > > > > > >>>>
> > > > > > > > > > > > > > > > >>>> > I created the ticket: https://issues.apache.org/jira/browse/IGNITE-5226
> > > > > > > > > > > > > > > > >>>> >
> > > > > > > > > > > > > > > > >>>> > I'll prepare a PR with the described solution in a couple of days.
> > > > > > > > > > > > > > > >>>> >
> > > > > > > > > > > > > > > > >>>> > 2017-05-15 15:05 GMT+03:00 Vyacheslav Daradur <[hidden email]>:
> > > > > > > > > > > > > > > > >>>> >
> > > > > > > > > > > > > > > > >>>> > > Hi, Igniters!
> > > > > > > > > > > > > > > > >>>> > >
> > > > > > > > > > > > > > > > >>>> > > Apache Ignite 2.0 is released.
> > > > > > > > > > > > > > > > >>>> > >
> > > > > > > > > > > > > > > > >>>> > > Let's continue the discussion about a compression design.
> > > > > > > > > > > > > > > > >>>> > >
> > > > > > > > > > > > > > > > >>>> > > At the moment, I have found only one solution which is compatible with
> > > > > > > > > > > > > > > > >>>> > > querying and indexing: per-object-field compression.
> > > > > > > > > > > > > > > > >>>> > > Per-field compression means that the metadata (a header) of an object
> > > > > > > > > > > > > > > > >>>> > > won't be compressed; only the serialized values of an object's fields
> > > > > > > > > > > > > > > > >>>> > > (in byte array form) will be compressed.
> > > > > > > > > > > > > > > > >>>> > >
> > > > > > > > > > > > > > > > >>>> > > This solution has some contentious issues:
> > > > > > > > > > > > > > > > >>>> > > - small values, like primitives and short arrays - it doesn't make
> > > > > > > > > > > > > > > > >>>> > > sense to compress them;
> > > > > > > > > > > > > > > > >>>> > > - it is not possible to use compression with Java predefined types;
> > > > > > > > > > > > > > > > >>>> > >
> > > > > > > > > > > > > > > > >>>> > > We can provide an annotation, @IgniteCompression for example,
> > > > > > > > > > > > > > > > >>>> > > which users can use to mark fields for compression.
> > > > > > > > > > > > > > > > >>>> > >
> > > > > > > > > > > > > > > > >>>> > > Any thoughts?
> > > > > > > > > > > > > > > > >>>> > >
> > > > > > > > > > > > > > > > >>>> > > Maybe someone already has a ready design?
> > > > > > > > > > > > > > > >>>> > >
> > > > > > > > > > > > > > > > >>>> > > 2017-04-10 11:06 GMT+03:00 Vyacheslav Daradur <[hidden email]>:
> > > > > > > > > > > > > > > > >>>> > >
> > > > > > > > > > > > > > > > >>>> > >> Alexey,
> > > > > > > > > > > > > > > > >>>> > >>
> > > > > > > > > > > > > > > > >>>> > >> Yes, I've read it.
> > > > > > > > > > > > > > > > >>>> > >>
> > > > > > > > > > > > > > > > >>>> > >> OK, let's discuss the public API design.
> > > > > > > > > > > > > > > > >>>> > >>
> > > > > > > > > > > > > > > > >>>> > >> I think we need to add some configuration entity to CacheConfiguration,
> > > > > > > > > > > > > > > > >>>> > >> which will contain the Compressor interface implementation and some
> > > > > > > > > > > > > > > > >>>> > >> useful parameters.
> > > > > > > > > > > > > > > > >>>> > >> Or maybe provide a BinaryMarshaller decorator, which will
> > > > > > > > > > > > > > > > >>>> > >> compress data after marshalling.
> > > > > > > > > > > > > > > >>>> > >>
> > > > > > > > > > > > > > > >>>> > >>
> > > > > > > > > > > > > > > > >>>> > >> 2017-04-10 10:40 GMT+03:00 Alexey Kuznetsov <[hidden email]>:
> > > > > > > > > > > > > > > > >>>> > >>
> > > > > > > > > > > > > > > > >>>> > >>> Vyacheslav,
> > > > > > > > > > > > > > > > >>>> > >>>
> > > > > > > > > > > > > > > > >>>> > >>> Did you read the initial discussion [1] about compression?
> > > > > > > > > > > > > > > > >>>> > >>> As far as I remember, we agreed to add only some "top-level" API in
> > > > > > > > > > > > > > > > >>>> > >>> order to provide a way for Ignite users to inject some sort of custom
> > > > > > > > > > > > > > > > >>>> > >>> compression.
> > > > > > > > > > > > > > > > >>>> > >>>
> > > > > > > > > > > > > > > > >>>> > >>> [1]
> > > > > > > > > > > > > > > > >>>> > >>> http://apache-ignite-developers.2346864.n4.nabble.com/Data-compression-in-Ignite-2-0-td10099.html
> > > > > > > > > > > > > > > >>>> > >>>
> > > > > > > > > > > > > > > > >>>> > >>> On Mon, Apr 10, 2017 at 2:19 PM, daradurvs <[hidden email]> wrote:
> > > > > > > > > > > > > > > > >>>> > >>>
> > > > > > > > > > > > > > > > >>>> > >>> > Hi Igniters!
> > > > > > > > > > > > > > > > >>>> > >>> >
> > > > > > > > > > > > > > > > >>>> > >>> > I am interested in this task:
> > > > > > > > > > > > > > > > >>>> > >>> > Provide some kind of pluggable compression SPI support
> > > > > > > > > > > > > > > > >>>> > >>> > <https://issues.apache.org/jira/browse/IGNITE-3592>
> > > > > > > > > > > > > > > > >>>> > >>> >
> > > > > > > > > > > > > > > > >>>> > >>> > I developed a solution at the BinaryMarshaller level, but the reviewer
> > > > > > > > > > > > > > > > >>>> > >>> > has rejected it.
> > > > > > > > > > > > > > > > >>>> > >>> >
> > > > > > > > > > > > > > > > >>>> > >>> > Let's continue the discussion of the task goals and solution design.
> > > > > > > > > > > > > > > > >>>> > >>> > As I understand it, the main goal of this task is to store data in
> > > > > > > > > > > > > > > > >>>> > >>> > compressed form.
> > > > > > > > > > > > > > > > >>>> > >>> > This is what I need from Ignite as its user. Compression provides
> > > > > > > > > > > > > > > > >>>> > >>> > savings on servers.
> > > > > > > > > > > > > > > > >>>> > >>> > We can store more data on the same servers at the cost of increasing
> > > > > > > > > > > > > > > > >>>> > >>> > CPU utilization.
> > > > > > > > > > > > > > > > >>>> > >>> >
> > > > > > > > > > > > > > > > >>>> > >>> > I'm researching the possibility of implementing compression at the
> > > > > > > > > > > > > > > > >>>> > >>> > cache level.
> > > > > > > > > > > > > > > > >>>> > >>> >
> > > > > > > > > > > > > > > > >>>> > >>> > Any thoughts?
> > > > > > > > > > > > > > > >>>> > >>> >
> > > > > > > > > > > > > > > >>>> > >>> > --
> > > > > > > > > > > > > > > >>>> > >>> > Best regards,
> > > > > > > > > > > > > > > >>>> > >>> > Vyacheslav
> > > > > > > > > > > > > > > >>>> > >>> >
> > > > > > > > > > > > > > > >>>> > >>> >
> > > > > > > > > > > > > > > >>>> > >>> >
> > > > > > > > > > > > > > > >>>> > >>> >
> > > > > > > > > > > > > > > >>>> > >>> >
> > > > > > > > > > > > > > > >>>> > >>>
> > > > > > > > > > > > > > > >>>> > >>>
> > > > > > > > > > > > > > > >>>> > >>>
> > > > > > > > > > > > > > > >>>> > >>> --
> > > > > > > > > > > > > > > >>>> > >>> Alexey Kuznetsov
> > > > > > > > > > > > > > > >>>> > >>>
> > > > > > > > > > > > > > > >>>> > >>
> > > > > > > > > > > > > > > >>>> > >>
> > > > > > > > > > > > > > > >>>> > >>
> > > > > > > > > > > > > > > >>>> > >> --
> > > > > > > > > > > > > > > >>>> > >> Best Regards, Vyacheslav
> > > > > > > > > > > > > > > >>>> > >>
> > > > > > > > > > > > > > > >>>> > >
> > > > > > > > > > > > > > > >>>> > >
> > > > > > > > > > > > > > > >>>> > >
> > > > > > > > > > > > > > > >>>> > > --
> > > > > > > > > > > > > > > >>>> > > Best Regards, Vyacheslav
> > > > > > > > > > > > > > > >>>> > >
> > > > > > > > > > > > > > > >>>> >
> > > > > > > > > > > > > > > >>>> >
> > > > > > > > > > > > > > > >>>> >
> > > > > > > > > > > > > > > >>>> > --
> > > > > > > > > > > > > > > >>>> > Best Regards, Vyacheslav
> > > > > > > > > > > > > > > >>>> >
> > > > > > > > > > > > > > > >>>>
> > > > > > > > > > > > > > > >>>
> > > > > > > > > > > > > > > >>>
> > > > > > > > > > > > > > > >>>
> > > > > > > > > > > > > > > >>> --
> > > > > > > > > > > > > > > >>> Best Regards, Vyacheslav
> > > > > > > > > > > > > > > >>>
> > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > >> --
> > > > > > > > > > > > > > > >> Best Regards, Vyacheslav
> > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > --
> > > > > > > > > > > > > > > > Best Regards, Vyacheslav
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > --
> > > > > > > > > > > > > > > Best Regards, Vyacheslav
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > --
> > > > > > > > > > > > > Best Regards, Vyacheslav
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > --
> > > > > > > > > > > >
> > > > > > > > > > > > Best Regards, Anton Churaev
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > --
> > > > > > > > > > > Best Regards, Vyacheslav
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > --
> > > > > > > > > >
> > > > > > > > > > Best Regards, Anton Churaev
> > > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > --
> > > > > > > > > Best Regards, Vyacheslav
> > > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > --
> > > > > > > >
> > > > > > > > Best Regards, Anton Churaev
> > > > > > > >
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > --
> > > > > > > Best Regards, Vyacheslav
> > > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > > --
> > > > > >
> > > > > > Best Regards, Anton Churaev
> > > > > >
> > > > >
> > > >
> > > >
> > > >
> > > > --
> > > > Best Regards, Vyacheslav
> > > >
> > >
> >
> >
> >
> > --
> > Best Regards, Vyacheslav
> >
>
>
>
> --
>
> Best Regards, Anton Churaev
>



--
Best Regards, Vyacheslav
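
The thread quoted above mentions a user-pluggable Compressor interface and notes that small values are not worth compressing, while text fields larger than roughly 250 bytes benefit. A rough Java sketch of how those two ideas could fit together follows. The interface name FieldCompressor, both classes, and the one-byte RAW/PACKED flag are assumptions made for illustration and are not taken from the pull request; Deflater from the JDK is used only as a stand-in for ZSTD.

```java
import java.io.ByteArrayOutputStream;
import java.util.Arrays;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

/** Hypothetical pluggable contract; not the Compressor interface from the PR. */
interface FieldCompressor {
    byte[] compress(byte[] data);
    byte[] decompress(byte[] data);
}

/** JDK Deflate implementation, used here as a stand-in for ZSTD. */
class DeflateFieldCompressor implements FieldCompressor {
    @Override public byte[] compress(byte[] data) {
        Deflater deflater = new Deflater();
        deflater.setInput(data);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] chunk = new byte[1024];
        while (!deflater.finished())
            out.write(chunk, 0, deflater.deflate(chunk));
        deflater.end();
        return out.toByteArray();
    }

    @Override public byte[] decompress(byte[] data) {
        Inflater inflater = new Inflater();
        inflater.setInput(data);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] chunk = new byte[1024];
        try {
            while (!inflater.finished()) {
                int n = inflater.inflate(chunk);
                if (n == 0 && inflater.needsInput())
                    throw new IllegalStateException("Truncated compressed data");
                out.write(chunk, 0, n);
            }
        } catch (DataFormatException e) {
            throw new IllegalStateException("Corrupt compressed data", e);
        } finally {
            inflater.end();
        }
        return out.toByteArray();
    }
}

/** Skips compression for values below minSize or when compression does not shrink them. */
class ThresholdCompressor implements FieldCompressor {
    static final byte RAW = 0;     // hypothetical flag: body stored as-is
    static final byte PACKED = 1;  // hypothetical flag: body was compressed

    private final FieldCompressor delegate;
    private final int minSize;

    ThresholdCompressor(FieldCompressor delegate, int minSize) {
        this.delegate = delegate;
        this.minSize = minSize;
    }

    @Override public byte[] compress(byte[] data) {
        if (data.length >= minSize) {
            byte[] packed = delegate.compress(data);
            if (packed.length < data.length)
                return withFlag(PACKED, packed);
        }
        return withFlag(RAW, data);
    }

    @Override public byte[] decompress(byte[] stored) {
        byte[] body = Arrays.copyOfRange(stored, 1, stored.length);
        return stored[0] == PACKED ? delegate.decompress(body) : body;
    }

    private static byte[] withFlag(byte flag, byte[] body) {
        byte[] result = new byte[body.length + 1];
        result[0] = flag;
        System.arraycopy(body, 0, result, 1, body.length);
        return result;
    }
}
```

A caller would construct it as new ThresholdCompressor(new DeflateFieldCompressor(), 250), where 250 is borrowed from the size mentioned in the thread rather than from any measured optimum. Keeping the threshold in a wrapper means a different algorithm can be swapped in without touching the size policy.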

Re: Data compression in Ignite 2.0

dsetrakyan
Igniters,

I have never seen a single Ignite user asking about compressing a single
field. However, we have had requests to secure certain fields, e.g.
passwords.

I personally do not think per-field compression is needed, unless we can
point out some concrete real life use cases.

D.

On Thu, Jun 8, 2017 at 3:42 AM, Vyacheslav Daradur <[hidden email]>
wrote:

> Anton,
>
> >> I thought that if compressed data is stored in memory, it will also be
> >> transmitted over the wire in compressed form. Is that right?
>
> In the per-field compression case, yes.
>
> 2017-06-08 13:36 GMT+03:00 Антон Чураев <[hidden email]>:
>
> > Guys, could you please help me.
> > I thought that if compressed data is stored in memory, it will also be
> > transmitted over the wire in compressed form. Is that right?
> >
> > 2017-06-08 13:30 GMT+03:00 Vyacheslav Daradur <[hidden email]>:
> >
> > > Vladimir,
> > >
> > > The main problem which I'am trying to solve is storing data in memory
> in
> > a
> > > compression form via Ignite.
> > > The main goal is using memory more effectivelly.
> > >
> > > >> here the much simpler step would be to full
> > > compression on per-cache basis rather than dealing with per-fields
> case.
> > >
> > > Please explain your idea. Compess data by memory-page?
> > > Is it compatible with quering and indexing?
> > >
> > > >> In the end, if user would like to compress particular field, he can
> > > always to it on his own
> > > I think we mustn't think in this way, if user need something he trying
> to
> > > choose a tool which has this feature OOTB.
> > >
> > >
> > >
> > > 2017-06-08 12:53 GMT+03:00 Vladimir Ozerov <[hidden email]>:
> > >
> > > > Igniters,
> > > >
> > > > Honestly I still do not see how to apply it gracefully this feature
> ti
> > > > Ignite. And overall approach to compress only particular fields looks
> > > > overcomplicated to me. Remember, that our main use case is an
> > application
> > > > without classes on the server. It means that any kind of annotations
> > are
> > > > inapplicable. To be more precise: proper API should be implemented to
> > > > handle no-class case (e.g. how would build such an object through
> > > > BinaryBuilder without a class?), and only then add annotations as
> > > > convenient addition to more basic API.
> > > >
> > > > It seems to me that full implementation, which takes in count proper
> > > > "classless" API, changes to binary metadata to reflect compressed
> > fields,
> > > > changes to SQL, changes to binary protocol, and porting to .NET and
> > CPP,
> > > > will yield very complex solution with little value to the product.
> > > >
> > > > Instead, as I proposed earlier, it seems that we'd better start with
> > the
> > > > problem we are trying to solve. Basically, compression could help in
> > two
> > > > cases:
> > > > 1) Transmitting data over wire - it should be implemented on
> > > communication
> > > > layer and should not affect binary serialization component a lot.
> > > > 2) Storing data in memory - here the much simpler step would be to
> full
> > > > compression on per-cache basis rather than dealing with per-fields
> > case.
> > > >
> > > > In the end, if user would like to compress particular field, he can
> > > always
> > > > to it on his own, and set already compressed field to our
> BinaryObject.
> > > >
> > > > Vladimir.
> > > >
> > > >
> > > > On Thu, Jun 8, 2017 at 12:37 PM, Vyacheslav Daradur <
> > [hidden email]
> > > >
> > > > wrote:
> > > >
> > > > > Valentin,
> > > > >
> > > > > Yes, I have the prototype[1][2]
> > > > >
> > > > > You can see an example of Java class[3] that I used in my
> benchmark.
> > > > > For example:
> > > > > class Foo {
> > > > > @BinaryCompression
> > > > > String data;
> > > > > }
> > > > > If user make decision to store the object in compressed form, he
> can
> > > use
> > > > > the annotation @BinaryCompression as shown above.
> > > > > It means annotated field 'data' will be compressed at marshalling.
> > > > >
> > > > > [1] https://github.com/apache/ignite/pull/1951
> > > > > [2] https://issues.apache.org/jira/browse/IGNITE-5226
> > > > > [3]
> > > > > https://github.com/daradurvs/ignite-compression/blob/
> > > > > master/src/main/java/ru/daradurvs/ignite/compression/
> > > model/Audit1F.java
> > > > >
> > > > >
> > > > >
> > > > > 2017-06-08 2:04 GMT+03:00 Valentin Kulichenko <
> > > > > [hidden email]
> > > > > >:
> > > > >
> > > > > > Vyacheslav, Anton,
> > > > > >
> > > > > > Are there any ideas and/or prototypes for the API? Your design
> > > > > suggestions
> > > > > > seem to make sense, but I would like to see how it all this will
> > like
> > > > > from
> > > > > > user's standpoint.
> > > > > >
> > > > > > -Val
> > > > > >
> > > > > > On Wed, Jun 7, 2017 at 1:06 AM, Антон Чураев <
> [hidden email]
> > >
> > > > > wrote:
> > > > > >
> > > > > > > Vyacheslav, correct me if something wrong
> > > > > > >
> > > > > > > We could provide opportunity of choose between CPU usage and
> > > MEM/NET
> > > > > > usage
> > > > > > > for users by compression some attributes of stored objects.
> > > > > > > You have learned design, and it is possible to localize changes
> > in
> > > > > > > marshalling without performance affect and current
> functionality.
> > > > > > >
> > > > > > > I think, that it's usefull for our project and users.
> > > > > > > Community, what do you think about this proposal?
> > > > > > >
> > > > > > >
> > > > > > > 2017-06-06 17:29 GMT+03:00 Vyacheslav Daradur <
> > [hidden email]
> > > >:
> > > > > > >
> > > > > > > > In short,
> > > > > > > >
> > > > > > > > During marshalling a fields is represented as
> > BinaryFieldAccessor
> > > > > which
> > > > > > > > manages its marshalling. It checks if the field is marked by
> > > > > annotation
> > > > > > > > @BinaryCompression, in that case - binary  representation of
> > > field
> > > > > > (bytes
> > > > > > > > array) will be compressed. It will be marked as compressed by
> > > types
> > > > > > > > constant (GridBinaryMarshaller.COMPRESSED), after this the
> > > > > compressed
> > > > > > > > bytes
> > > > > > > > array wiil be include in binary representation of whole
> object.
> > > > Note,
> > > > > > > > header of marshalled object will not be compressed.
> Compression
> > > > > > affected
> > > > > > > > only object's field representation.
> > > > > > > >
> > > > > > > > Objects in IgniteCache is represented as BinaryObject which
> is
> > > > > wrapper
> > > > > > > over
> > > > > > > > bytes array of marshalled object.
> > > > > > > > BinaryObject provides some usefull methods, which are used by
> > > > Ignite
> > > > > > > > systems.
> > > > > > > > For example, the Queries use BinaryObject#field method, which
> > > > > > > deserializes
> > > > > > > > only field of object, without deserializing of whole object.
> > > > > > > > BinaryObject#field method during deserialization, if meets
> the
> > > > > constant
> > > > > > > of
> > > > > > > > compressed type, decompress this bytes array, then continue
> > > > > > unmarshalling
> > > > > > > > as usual.
> > > > > > > >
> > > > > > > > Now, I introduced the Compressor interface in
> > > IgniteConfigurations,
> > > > > it
> > > > > > > > allows user to use own implementation of compressor - it is
> the
> > > > > > > requirement
> > > > > > > > in the task[1].
> > > > > > > >
> > > > > > > > As far as I know, Vladimir Ozerov doesn't like the idea of
> > > granting
> > > > > > this
> > > > > > > > opportunity to the user.
> > > > > > > > In that case we can choose a compression algorithm which we
> > will
> > > > > > provide
> > > > > > > by
> > > > > > > > default and will move the interface to internals of binary
> > > > > > > infractructure.
> > > > > > > > For this case I've prepared benchmarked, which I've sent
> > earlier.
> > > > > > > >
> > > > > > > > I vote for ZSTD algorithm[2], it provides good compression
> > ratio
> > > > and
> > > > > > good
> > > > > > > > throughput. It has implementation in Java, .NET and C++, and
> > has
> > > > > > > > ASF-friendly license, we can use it in the all Ignite
> > platforms.
> > > > > > > > You can look at an assessment of this algorithm in my
> > benchmark's
> > > > > > > >
> > > > > > > > [1] https://issues.apache.org/jira/browse/IGNITE-3592
> > > > > > > > [2]https://github.com/facebook/zstd
> > > > > > > >
> > > > > > > >
> > > > > > > > 2017-06-06 16:02 GMT+03:00 Антон Чураев <
> [hidden email]
> > >:
> > > > > > > >
> > > > > > > > > Looks good for me.
> > > > > > > > >
> > > > > > > > > Could You propose design of implementation in couple of
> > > > sentences?
> > > > > > > > > So that we can estimate the completeness and complexity of
> > the
> > > > > > > proposal.
> > > > > > > > >
> > > > > > > > > 2017-06-06 15:26 GMT+03:00 Vyacheslav Daradur <
> > > > [hidden email]
> > > > > >:
> > > > > > > > >
> > > > > > > > > > Anton,
> > > > > > > > > >
> > > > > > > > > > Of course, the solution does not affect on existing
> > > > > > implementation. I
> > > > > > > > > mean,
> > > > > > > > > > there is no changes if user not use the annotation
> > > > > > > @BinaryCompression.
> > > > > > > > > (no
> > > > > > > > > > performance changes)
> > > > > > > > > > Only if user make decision to use compression on specific
> > > field
> > > > > or
> > > > > > > > fields
> > > > > > > > > > of a class - in that case compression will be used at
> > > > marshalling
> > > > > > in
> > > > > > > > > > relation to annotated fields.
> > > > > > > > > >
> > > > > > > > > > 2017-06-06 15:10 GMT+03:00 Антон Чураев <
> > > [hidden email]
> > > > >:
> > > > > > > > > >
> > > > > > > > > > > Vyacheslav,
> > > > > > > > > > >
> > > > > > > > > > > Is it possible to propose implementation that can be
> > > switched
> > > > > on
> > > > > > > > > > on-demand?
> > > > > > > > > > > In this case it should not affect performance of
> current
> > > > > > solution.
> > > > > > > > > > >
> > > > > > > > > > > I mean, that users should make decision what is more
> > > > important
> > > > > > for
> > > > > > > > > them:
> > > > > > > > > > > throutput or memory/net usage.
> > > > > > > > > > > May be they will be choose not all objects, or only
> some
> > > > > > attributes
> > > > > > > > of
> > > > > > > > > > > objects for compress.
> > > > > > > > > > >
> > > > > > > > > > > 2017-06-06 14:48 GMT+03:00 Vyacheslav Daradur <
> > > > > > [hidden email]
> > > > > > > >:
> > > > > > > > > > >
> > > > > > > > > > > > Conclusion:
> > > > > > > > > > > > Provided solution allows reduce size of an object in
> > > > > > IgniteCache
> > > > > > > at
> > > > > > > > > the
> > > > > > > > > > > > cost of throughput reduction (small - in some cases),
> > it
> > > > > > depends
> > > > > > > on
> > > > > > > > > > part
> > > > > > > > > > > of
> > > > > > > > > > > > object which will be compressed and compression
> > > algorithm.
> > > > > > > > > > > > I mean, we can make more effective use of memory, and
> > in
> > > > some
> > > > > > > cases
> > > > > > > > > it
> > > > > > > > > > > can
> > > > > > > > > > > > reduce loading of the interconnect. (replication,
> > > > > rebalancing)
> > > > > > > > > > > >
> > > > > > > > > > > > Especially, it will be particularly useful for
> object's
> > > > > fields
> > > > > > > > which
> > > > > > > > > > are
> > > > > > > > > > > > large text (>~ 250 bytes) and can be effectively
> > > > compressed.
> > > > > > > > > > > >
> > > > > > > > > > > > 2017-06-06 12:00 GMT+03:00 Антон Чураев <
> > > > > [hidden email]
> > > > > > >:
> > > > > > > > > > > >
> > > > > > > > > > > > > Vyacheslav, thank you! But could you please
> provide a
> > > > > > > conclusions
> > > > > > > > > or
> > > > > > > > > > > > > proposals based on this benchmarks?
> > > > > > > > > > > > >
> > > > > > > > > > > > > 2017-06-06 11:28 GMT+03:00 Vyacheslav Daradur <
> > > > > > > > [hidden email]
> > > > > > > > > >:
> > > > > > > > > > > > >
> > > > > > > > > > > > > > Dmitry,
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Excel-pages:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > 1). "Compression ratio (2)" - shows object size,
> > with
> > > > > > > > compression
> > > > > > > > > > and
> > > > > > > > > > > > > > without compression. (Conditions: literal text)
> > > > > > > > > > > > > > 1st graph shows compression ratios of using
> > different
> > > > > > > > compression
> > > > > > > > > > > > > algrithms
> > > > > > > > > > > > > > depending on size of compressed field.
> > > > > > > > > > > > > > 2nd graph shows evaluation of size of objects
> > > depending
> > > > > on
> > > > > > > > sizes
> > > > > > > > > > and
> > > > > > > > > > > > > > compression algorithms.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > 2). "Compression ratio (1)" - shows object size,
> > with
> > > > > > > > compression
> > > > > > > > > > and
> > > > > > > > > > > > > > without compression. (Conditions:  badly
> compressed
> > > > > > character
> > > > > > > > > > > sequence)
> > > > > > > > > > > > > > 1st graph shows compression ratios of using
> > different
> > > > > > > > compression
> > > > > > > > > > > > > > algrithms depending on size of compressed field.
> > > > > > > > > > > > > > 2nd graph shows evaluation of size of objects
> > > depending
> > > > > on
> > > > > > > > sizes
> > > > > > > > > > and
> > > > > > > > > > > > > > compression algorithms.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > 3) 'put-avg" - shows average time of the "put"
> > > > operation
> > > > > > > > > depending
> > > > > > > > > > on
> > > > > > > > > > > > > size
> > > > > > > > > > > > > > and compression algorithms.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > 4) 'put-thrpt" - shows throughput of the "put"
> > > > operation
> > > > > > > > > depending
> > > > > > > > > > on
> > > > > > > > > > > > > size
> > > > > > > > > > > > > > and compression algorithms.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > 5) 'get-avg" - shows average time of the "get"
> > > > operation
> > > > > > > > > depending
> > > > > > > > > > on
> > > > > > > > > > > > > size
> > > > > > > > > > > > > > and compression algorithms.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > 6) 'get-thrpt" - shows throughput of the "get"
> > > > operation
> > > > > > > > > depending
> > > > > > > > > > on
> > > > > > > > > > > > > size
> > > > > > > > > > > > > > and compression algorithms.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > 2017-06-06 10:59 GMT+03:00 Dmitriy Setrakyan <
> > > > > > > > > > [hidden email]
> > > > > > > > > > > >:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Vladimir, I am not sure how to interpret the
> > > graphs?
> > > > > What
> > > > > > > are
> > > > > > > > > we
> > > > > > > > > > > > > looking
> > > > > > > > > > > > > > > at?
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > On Tue, Jun 6, 2017 at 12:33 AM, Vyacheslav
> > > Daradur <
> > > > > > > > > > > > > [hidden email]
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Hi, Igniters.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > I've prepared some benchmarking. Results [1].
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > And I've prepared the evaluation in the form
> of
> > > > > > diagrams
> > > > > > > > [2].
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > I hope that helps to interest the community
> and
> > > > > > > > accelerates a
> > > > > > > > > > > > > reaction
> > > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > this improvment :)
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > [1]
> > > > > > > > > > > > > > > > https://github.com/daradurvs/
> > > > > ignite-compression/tree/
> > > > > > > > > > > > > > > > master/src/main/resources/result
> > > > > > > > > > > > > > > > [2] https://drive.google.com/file/d/
> > > > > > > > > > > 0B2CeUAOgrHkoMklyZ25YTEdKcEk/
> > > > > > > > > > > > > view
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > 2017-05-24 9:49 GMT+03:00 Vyacheslav Daradur
> <
> > > > > > > > > > > [hidden email]
> > > > > > > > > > > > >:
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Guys, any thoughts?
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > 2017-05-16 13:40 GMT+03:00 Vyacheslav
> > Daradur <
> > > > > > > > > > > > [hidden email]
> > > > > > > > > > > > > >:
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > >> Hi guys,
> > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > >> I've prepared the PR to show my idea.
> > > > > > > > > > > > > > > > >> https://github.com/apache/
> > > > ignite/pull/1951/files
> > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > >> About querying - I've just copied existing
> > > tests
> > > > > and
> > > > > > > > have
> > > > > > > > > > > > > annotated
> > > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > >> testing data.
> > > > > > > > > > > > > > > > >> https://github.com/apache/
> > > > > > > ignite/pull/1951/files#diff-
> > > > > > > > > > c19a9d
> > > > > > > > > > > > > > > > >> f4058141d059bb577e75244764
> > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > >> It means fields which will be marked by
> > > > > > > > @BinaryCompression
> > > > > > > > > > > will
> > > > > > > > > > > > be
> > > > > > > > > > > > > > > > >> compressed at marshalling via
> > > BinaryMarshaller.
> > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > >> This solution has no effect on existing
> data
> > > or
> > > > > > > project
> > > > > > > > > > > > > > architecture.
> > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > >> I'll be glad to see your thougths.
> > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > >> 2017-05-15 19:18 GMT+03:00 Vyacheslav
> > Daradur
> > > <
> > > > > > > > > > > > > [hidden email]
> > > > > > > > > > > > > > >:
> > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > >>> Dmitriy,
> > > > > > > > > > > > > > > > >>>
> > > > > > > > > > > > > > > > >>> I have ready prototype. I want to show
> it.
> > > > > > > > > > > > > > > > >>> It is always easier to discuss on
> example.
> > > > > > > > > > > > > > > > >>>
> > > > > > > > > > > > > > > > >>> 2017-05-15 19:02 GMT+03:00 Dmitriy
> > Setrakyan
> > > <
> > > > > > > > > > > > > > [hidden email]
> > > > > > > > > > > > > > > >:
> > > > > > > > > > > > > > > > >>>
> > > > > > > > > > > > > > > > >>>> Vyacheslav,
> > > > > > > > > > > > > > > > >>>>
> > > > > > > > > > > > > > > > >>>> I think it is a bit premature to
> provide a
> > > PR
> > > > > > > without
> > > > > > > > > > > getting
> > > > > > > > > > > > a
> > > > > > > > > > > > > > > > >>>> community
> > > > > > > > > > > > > > > > >>>> consensus on the dev list. Please allow
> > some
> > > > > time
> > > > > > > for
> > > > > > > > > the
> > > > > > > > > > > > > > community
> > > > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > >>>> respond.
> > > > > > > > > > > > > > > > >>>>
> > > > > > > > > > > > > > > > >>>> D.
> > > > > > > > > > > > > > > > >>>>
> > > > > > > > > > > > > > > > >>>> On Mon, May 15, 2017 at 6:36 AM,
> > Vyacheslav
> > > > > > Daradur
> > > > > > > <
> > > > > > > > > > > > > > > > >>>> [hidden email]>
> > > > > > > > > > > > > > > > >>>> wrote:
> > > > > > > > > > > > > > > > >>>>
> > > > > > > > > > > > > > > > >>>> > I created the ticket:
> > > > > > > > https://issues.apache.org/jira
> > > > > > > > > > > > > > > > >>>> /browse/IGNITE-5226
> > > > > > > > > > > > > > > > >>>> >
> > > > > > > > > > > > > > > > >>>> > I'll prepare a PR with described
> > solution
> > > in
> > > > > > > couple
> > > > > > > > of
> > > > > > > > > > > days.
> > > > > > > > > > > > > > > > >>>> >
> > > > > > > > > > > > > > > > >>>> > 2017-05-15 15:05 GMT+03:00 Vyacheslav
> > > > Daradur
> > > > > <
> > > > > > > > > > > > > > > [hidden email]
> > > > > > > > > > > > > > > > >:
> > > > > > > > > > > > > > > > >>>> >
> > > > > > > > > > > > > > > > >>>> > > Hi, Igniters!
> > > > > > > > > > > > > > > > >>>> > >
> > > > > > > > > > > > > > > > >>>> > > Apache 2.0 is released.
> > > > > > > > > > > > > > > > >>>> > >
> > > > > > > > > > > > > > > > >>>> > > Let's continue the discussion about
> a
> > > > > > > compression
> > > > > > > > > > > design.
> > > > > > > > > > > > > > > > >>>> > >
> > > > > > > > > > > > > > > > >>>> > > At the moment, I found only one
> > solution
> > > > > which
> > > > > > > is
> > > > > > > > > > > > compatible
> > > > > > > > > > > > > > > with
> > > > > > > > > > > > > > > > >>>> > querying
> > > > > > > > > > > > > > > > >>>> > > and indexing, this is
> > per-objects-field
> > > > > > > > compression.
> > > > > > > > > > > > > > > > >>>> > > Per-fields compression means that
> > > metadata
> > > > > (a
> > > > > > > > > header)
> > > > > > > > > > of
> > > > > > > > > > > > an
> > > > > > > > > > > > > > > object
> > > > > > > > > > > > > > > > >>>> won't
> > > > > > > > > > > > > > > > >>>> > > be compressed, only serialized
> values
> > of
> > > > an
> > > > > > > object
> > > > > > > > > > > fields
> > > > > > > > > > > > > (in
> > > > > > > > > > > > > > > > bytes
> > > > > > > > > > > > > > > > >>>> array
> > > > > > > > > > > > > > > > >>>> > > form) will be compressed.
> > > > > > > > > > > > > > > > >>>> > >
> > > > > > > > > > > > > > > > >>>> > > This solution have some contentious
> > > > issues:
> > > > > > > > > > > > > > > > >>>> > > - small values, like primitives and
> > > short
> > > > > > > arrays -
> > > > > > > > > > there
> > > > > > > > > > > > > isn't
> > > > > > > > > > > > > > > > >>>> sense to
> > > > > > > > > > > > > > > > >>>> > > compress them;
> > > > > > > > > > > > > > > > >>>> > > - there is no possible to use
> > > compression
> > > > > with
> > > > > > > > > > > > > java-predefined
> > > > > > > > > > > > > > > > >>>> types;
> > > > > > > > > > > > > > > > >>>> > >
> > > > > > > > > > > > > > > > >>>> > > We can provide an annotation,
> > > > > > > @IgniteCompression -
> > > > > > > > > for
> > > > > > > > > > > > > > example,
> > > > > > > > > > > > > > > > >>>> which can
> > > > > > > > > > > > > > > > >>>> > > be used by users for marking fields
> to
> > > > > > compress.
> > > > > > > > > > > > > > > > >>>> > >
> > > > > > > > > > > > > > > > >>>> > > Any thoughts?
> > > > > > > > > > > > > > > > >>>> > >
> > > > > > > > > > > > > > > > >>>> > > Maybe someone already have ready
> > design?
> > > > > > > > > > > > > > > > >>>> > >
> > > > > > > > > > > > > > > > >>>> > > 2017-04-10 11:06 GMT+03:00
> Vyacheslav
> > > > > Daradur
> > > > > > <
> > > > > > > > > > > > > > > > [hidden email]
> > > > > > > > > > > > > > > > >>>> >:
> > > > > > > > > > > > > > > > >>>> > >
> > > > > > > > > > > > > > > > >>>> > >> Alexey,
> > > > > > > > > > > > > > > > >>>> > >>
> > > > > > > > > > > > > > > > >>>> > >> Yes, I've read it.
> > > > > > > > > > > > > > > > >>>> > >>
> > > > > > > > > > > > > > > > >>>> > >> Ok, let's discuss about public API
> > > > design.
> > > > > > > > > > > > > > > > >>>> > >>
> > > > > > > > > > > > > > > > >>>> > >> I think we need to add some a
> > configure
> > > > > > entity
> > > > > > > to
> > > > > > > > > > > > > > > > >>>> CacheConfiguration,
> > > > > > > > > > > > > > > > >>>> > >> which will contain the Compressor
> > > > interface
> > > > > > > > > > > > implementation
> > > > > > > > > > > > > > and
> > > > > > > > > > > > > > > > some
> > > > > > > > > > > > > > > > >>>> > usefull
> > > > > > > > > > > > > > > > >>>> > >> parameters.
> > > > > > > > > > > > > > > > >>>> > >> Or maybe to provide a
> > BinaryMarshaller
> > > > > > > decorator,
> > > > > > > > > > which
> > > > > > > > > > > > > will
> > > > > > > > > > > > > > be
> > > > > > > > > > > > > > > > >>>> compress
> > > > > > > > > > > > > > > > >>>> > >> data after marshalling.
> > > > > > > > > > > > > > > > >>>> > >>
> > > > > > > > > > > > > > > > >>>> > >>
> > > > > > > > > > > > > > > > >>>> > >> 2017-04-10 10:40 GMT+03:00 Alexey
> > > > > Kuznetsov <
> > > > > > > > > > > > > > > > [hidden email]
> > > > > > > > > > > > > > > > >>>> >:
> > > > > > > > > > > > > > > > >>>> > >>
> > > > > > > > > > > > > > > > >>>> > >>> Vyacheslav,
> > > > > > > > > > > > > > > > >>>> > >>>
> > > > > > > > > > > > > > > > >>>> > >>> Did you read initial discussion
> [1]
> > > > about
> > > > > > > > > > compression?
> > > > > > > > > > > > > > > > >>>> > >>> As far as I remember we agreed to
> > add
> > > > only
> > > > > > > some
> > > > > > > > > > > > > "top-level"
> > > > > > > > > > > > > > > API
> > > > > > > > > > > > > > > > in
> > > > > > > > > > > > > > > > >>>> > order
> > > > > > > > > > > > > > > > >>>> > >>> to
> > > > > > > > > > > > > > > > >>>> > >>> provide a way for
> > > > > > > > > > > > > > > > >>>> > >>> Ignite users to inject some sort
> of
> > > > custom
> > > > > > > > > > > compression.
> > > > > > > > > > > > > > > > >>>> > >>>
> > > > > > > > > > > > > > > > >>>> > >>>
> > > > > > > > > > > > > > > > >>>> > >>> [1]
> > > > > > > > > > > > > > > > >>>> > >>> http://apache-ignite-developer
> > > > > > > > s.2346864.n4.nabble
> > > > > > > > > .
> > > > > > > > > > > > > > com/Data-c
> > > > > > > > > > > > > > > > >>>> > >>> ompression-in-Ignite-2-0-
> > td10099.html
> > > > > > > > > > > > > > > > >>>> > >>>
> > > > > > > > > > > > > > > > >>>> > >>> On Mon, Apr 10, 2017 at 2:19 PM,
> > > > > daradurvs <
> > > > > > > > > > > > > > > [hidden email]
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > >>>> > wrote:
> > > > > > > > > > > > > > > > >>>> > >>>
> > > > > > > > > > > > > > > > >>>> > >>> > Hi Igniters!
> > > > > > > > > > > > > > > > >>>> > >>> >
> > > > > > > > > > > > > > > > >>>> > >>> > I am interested in this task.
> > > > > > > > > > > > > > > > >>>> > >>> > Provide some kind of pluggable
> > > > > compression
> > > > > > > SPI
> > > > > > > > > > > support
> > > > > > > > > > > > > > > > >>>> > >>> > <https://issues.apache.org/
> > > > > > > > > > jira/browse/IGNITE-3592>
> > > > > > > > > > > > > > > > >>>> > >>> >
> > > > > > > > > > > > > > > > >>>> > >>> > I developed a solution on
> > > > > > > > > BinaryMarshaller-level,
> > > > > > > > > > > but
> > > > > > > > > > > > > > > reviewer
> > > > > > > > > > > > > > > > >>>> has
> > > > > > > > > > > > > > > > >>>> > >>> rejected
> > > > > > > > > > > > > > > > >>>> > >>> > it.
> > > > > > > > > > > > > > > > >>>> > >>> >
> > > > > > > > > > > > > > > > >>>> > >>> > Let's continue discussion of
> task
> > > > goals
> > > > > > and
> > > > > > > > > > solution
> > > > > > > > > > > > > > design.
> > > > > > > > > > > > > > > > >>>> > >>> > As I understood that, the main
> > goal
> > > of
> > > > > > this
> > > > > > > > task
> > > > > > > > > > is
> > > > > > > > > > > to
> > > > > > > > > > > > > > store
> > > > > > > > > > > > > > > > >>>> data in
> > > > > > > > > > > > > > > > >>>> > >>> > compressed form.
> > > > > > > > > > > > > > > > >>>> > >>> > This is what I need from Ignite
> as
> > > its
> > > > > > user.
> > > > > > > > > > > > Compression
> > > > > > > > > > > > > > > > >>>> provides
> > > > > > > > > > > > > > > > >>>> > >>> economy
> > > > > > > > > > > > > > > > >>>> > >>> > on
> > > > > > > > > > > > > > > > >>>> > >>> > servers.
> > > > > > > > > > > > > > > > >>>> > >>> > We can store more data on same
> > > servers
> > > > > at
> > > > > > > the
> > > > > > > > > cost
> > > > > > > > > > > of
> > > > > > > > > > > > > > > > >>>> increasing CPU
> > > > > > > > > > > > > > > > >>>> > >>> > utilization.
> > > > > > > > > > > > > > > > >>>> > >>> >
> > > > > > > > > > > > > > > > >>>> > >>> > I'm researching a possibility of
> > > > > > > > implementation
> > > > > > > > > of
> > > > > > > > > > > > > > > compression
> > > > > > > > > > > > > > > > >>>> at the
> > > > > > > > > > > > > > > > >>>> > >>> > cache-level.
> > > > > > > > > > > > > > > > >>>> > >>> >
> > > > > > > > > > > > > > > > >>>> > >>> > Any thoughts?
> > > > > > > > > > > > > > > > >>>> > >>> >
> > > > > > > > > > > > > > > > >>>> > >>> > --
> > > > > > > > > > > > > > > > >>>> > >>> > Best regards,
> > > > > > > > > > > > > > > > >>>> > >>> > Vyacheslav
> > > > > > > > > > > > > > > > >>>> > >>> >
> > > > > > > > > > > > > > > > >>>> > >>> >
> > > > > > > > > > > > > > > > >>>> > >>> >
> > > > > > > > > > > > > > > > >>>> > >>> >
> > > > > > > > > > > > > > > > >>>> > >>> > --
> > > > > > > > > > > > > > > > >>>> > >>> > View this message in context:
> > > > > > > > > > http://apache-ignite-
> > > > > > > > > > > > > > > > >>>> > >>> > developers.2346864.n4.nabble.
> > > > > > > > > > > com/Data-compression-in-
> > > > > > > > > > > > > > > > >>>> > >>> > Ignite-2-0-tp10099p16317.html
> > > > > > > > > > > > > > > > >>>> > >>> > Sent from the Apache Ignite
> > > Developers
> > > > > > > mailing
> > > > > > > > > > list
> > > > > > > > > > > > > > archive
> > > > > > > > > > > > > > > at
> > > > > > > > > > > > > > > > >>>> > >>> Nabble.com.
> > > > > > > > > > > > > > > > >>>> > >>> >
> > > > > > > > > > > > > > > > >>>> > >>>
> > > > > > > > > > > > > > > > >>>> > >>>
> > > > > > > > > > > > > > > > >>>> > >>>
> > > > > > > > > > > > > > > > >>>> > >>> --
> > > > > > > > > > > > > > > > >>>> > >>> Alexey Kuznetsov
> > > > > > > > > > > > > > > > >>>> > >>>
> > > > > > > > > > > > > > > > >>>> > >>
> > > > > > > > > > > > > > > > >>>> > >>
> > > > > > > > > > > > > > > > >>>> > >>
> > > > > > > > > > > > > > > > >>>> > >> --
> > > > > > > > > > > > > > > > >>>> > >> Best Regards, Vyacheslav
> > > > > > > > > > > > > > > > >>>> > >>
> > > > > > > > > > > > > > > > >>>> > >
> > > > > > > > > > > > > > > > >>>> > >
> > > > > > > > > > > > > > > > >>>> > >
> > > > > > > > > > > > > > > > >>>> > > --
> > > > > > > > > > > > > > > > >>>> > > Best Regards, Vyacheslav
> > > > > > > > > > > > > > > > >>>> > >
> > > > > > > > > > > > > > > > >>>> >
> > > > > > > > > > > > > > > > >>>> >
> > > > > > > > > > > > > > > > >>>> >
> > > > > > > > > > > > > > > > >>>> > --
> > > > > > > > > > > > > > > > >>>> > Best Regards, Vyacheslav
> > > > > > > > > > > > > > > > >>>> >
> > > > > > > > > > > > > > > > >>>>
> > > > > > > > > > > > > > > > >>>
> > > > > > > > > > > > > > > > >>>
> > > > > > > > > > > > > > > > >>>
> > > > > > > > > > > > > > > > >>> --
> > > > > > > > > > > > > > > > >>> Best Regards, Vyacheslav
> > > > > > > > > > > > > > > > >>>
> > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > >>

Re: Data compression in Ignite 2.0

daradurvs
Guys, I want to be clear:
* The "per-field compression" design is the result of researching Ignite's
binary infrastructure and some of its other parts (querying, indexing,
etc.).
* Compressing the whole object would be more effective, but then it is not
compatible with querying and indexing (or there is a large overhead from
decompressing the whole object, or cache pages, on demand).
* "Per-field compression" is one of the ways to implement the compression
feature.
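
To make the trade-off in the list above concrete, here is a minimal sketch
of the per-field idea. It is not Ignite's binary format and not the code
from the pull request: the class name, the byte layout and the PLAIN /
COMPRESSED markers are all assumptions made up for illustration, and the
JDK's Deflater stands in for whatever algorithm would actually be chosen.
The point it shows is that an uncompressed field (here the id) can be read
straight from the bytes, while only the marked field is inflated on access.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

/**
 * Toy layout (hypothetical, NOT Ignite's binary format):
 * [int id][byte marker][int rawLen][int storedLen][stored bytes]
 */
public class PerFieldCompressionSketch {
    public static final byte PLAIN = 0;      // hypothetical marker: field stored as-is
    public static final byte COMPRESSED = 1; // hypothetical marker: field stored deflated

    /** Serializes the object; only the 'data' field is optionally compressed. */
    public static byte[] marshal(int id, String data, boolean compress) {
        byte[] raw = data.getBytes(StandardCharsets.UTF_8);
        byte[] stored = compress ? deflate(raw) : raw;
        ByteBuffer buf = ByteBuffer.allocate(4 + 1 + 4 + 4 + stored.length);
        buf.putInt(id);
        buf.put(compress ? COMPRESSED : PLAIN);
        buf.putInt(raw.length);
        buf.putInt(stored.length);
        buf.put(stored);
        return buf.array();
    }

    /** Reads the uncompressed 'id' field without touching the compressed bytes. */
    public static int readId(byte[] obj) {
        return ByteBuffer.wrap(obj).getInt(0);
    }

    /** Reads 'data', inflating it only if the marker says it was compressed. */
    public static String readData(byte[] obj) {
        ByteBuffer buf = ByteBuffer.wrap(obj);
        byte marker = buf.get(4);
        int rawLen = buf.getInt(5);
        int storedLen = buf.getInt(9);
        byte[] stored = new byte[storedLen];
        buf.position(13);
        buf.get(stored);
        byte[] raw = marker == COMPRESSED ? inflate(stored, rawLen) : stored;
        return new String(raw, StandardCharsets.UTF_8);
    }

    private static byte[] deflate(byte[] raw) {
        Deflater deflater = new Deflater();
        deflater.setInput(raw);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] chunk = new byte[1024];
        while (!deflater.finished()) {
            int n = deflater.deflate(chunk);
            out.write(chunk, 0, n);
        }
        deflater.end();
        return out.toByteArray();
    }

    private static byte[] inflate(byte[] stored, int rawLen) {
        Inflater inflater = new Inflater();
        inflater.setInput(stored);
        byte[] raw = new byte[rawLen];
        try {
            int off = 0;
            while (off < rawLen && !inflater.finished()) {
                int n = inflater.inflate(raw, off, rawLen - off);
                if (n == 0 && (inflater.needsInput() || inflater.needsDictionary()))
                    throw new IllegalStateException("Truncated compressed field");
                off += n;
            }
        } catch (DataFormatException e) {
            throw new IllegalStateException("Corrupt compressed field", e);
        } finally {
            inflater.end();
        }
        return raw;
    }

    public static void main(String[] args) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 40; i++)
            sb.append("lorem ipsum dolor sit amet ");
        String text = sb.toString();

        byte[] plain = marshal(42, text, false);
        byte[] packed = marshal(42, text, true);

        System.out.println("plain size:  " + plain.length);
        System.out.println("packed size: " + packed.length);
        System.out.println("id read without decompression: " + readId(packed));
        System.out.println("round trip ok: " + text.equals(readData(packed)));
    }
}
```

With whole-object compression, readId would first have to inflate the
entire byte array, which is the on-demand overhead mentioned in the second
point.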

I'm new to Ignite, so I may be mistaken about some things.
For the last 3-4 months I've tried to start a discussion about a design,
but nobody has answered (except Dmitry and Valentin, who were interested
in how it works).
But I understand that this is a community and nobody is obliged to anybody.

There are strong Ignite experts here.
If they can help me and the community with a design for the compression
feature, that will be great.
At the moment I have the desire and the time to work on the compression
feature in Ignite.
Let's use this opportunity :)

2017-06-09 5:36 GMT+03:00 Dmitriy Setrakyan <[hidden email]>:

> Igniters,
>
> I have never seen a single Ignite user asking about compressing a single
> field. However, we have had requests to secure certain fields, e.g.
> passwords.
>
> I personally do not think per-field compression is needed, unless we can
> point out some concrete real life use cases.
>
> D.
>
> On Thu, Jun 8, 2017 at 3:42 AM, Vyacheslav Daradur <[hidden email]>
> wrote:
>
> > Anton,
> >
> > >> I thought that if there will storing compressed data in the memory,
> data
> > >> will transmit over wire in compression too. Is it right?
> >
> > In per-field compression case - yes.
> >
> > 2017-06-08 13:36 GMT+03:00 Антон Чураев <[hidden email]>:
> >
> > > Guys, could you please help me.
> > > I thought that if there will storing compressed data in the memory,
> data
> > > will transmit over wire in compression too. Is it right?
> > >
> > > 2017-06-08 13:30 GMT+03:00 Vyacheslav Daradur <[hidden email]>:
> > >
> > > > Vladimir,
> > > >
> > > > The main problem which I'am trying to solve is storing data in memory
> > in
> > > a
> > > > compression form via Ignite.
> > > > The main goal is using memory more effectivelly.
> > > >
> > > > >> here the much simpler step would be to full
> > > > compression on per-cache basis rather than dealing with per-fields
> > case.
> > > >
> > > > Please explain your idea. Compess data by memory-page?
> > > > Is it compatible with quering and indexing?
> > > >
> > > > >> In the end, if user would like to compress particular field, he
> can
> > > > always to it on his own
> > > > I think we mustn't think in this way, if user need something he
> trying
> > to
> > > > choose a tool which has this feature OOTB.
> > > >
> > > >
> > > >
> > > > 2017-06-08 12:53 GMT+03:00 Vladimir Ozerov <[hidden email]>:
> > > >
> > > > > Igniters,
> > > > >
> > > > > Honestly I still do not see how to apply it gracefully this feature
> > ti
> > > > > Ignite. And overall approach to compress only particular fields
> looks
> > > > > overcomplicated to me. Remember, that our main use case is an
> > > application
> > > > > without classes on the server. It means that any kind of
> annotations
> > > are
> > > > > inapplicable. To be more precise: proper API should be implemented
> to
> > > > > handle no-class case (e.g. how would build such an object through
> > > > > BinaryBuilder without a class?), and only then add annotations as
> > > > > convenient addition to more basic API.
> > > > >
> > > > > It seems to me that full implementation, which takes in count
> proper
> > > > > "classless" API, changes to binary metadata to reflect compressed
> > > fields,
> > > > > changes to SQL, changes to binary protocol, and porting to .NET and
> > > CPP,
> > > > > will yield very complex solution with little value to the product.
> > > > >
> > > > > Instead, as I proposed earlier, it seems that we'd better start
> with
> > > the
> > > > > problem we are trying to solve. Basically, compression could help
> in
> > > two
> > > > > cases:
> > > > > 1) Transmitting data over wire - it should be implemented on
> > > > communication
> > > > > layer and should not affect binary serialization component a lot.
> > > > > 2) Storing data in memory - here the much simpler step would be to
> > full
> > > > > compression on per-cache basis rather than dealing with per-fields
> > > case.
> > > > >
> > > > > In the end, if user would like to compress particular field, he can
> > > > always
> > > > > to it on his own, and set already compressed field to our
> > BinaryObject.
> > > > >
> > > > > Vladimir.
> > > > >
> > > > >
> > > > > On Thu, Jun 8, 2017 at 12:37 PM, Vyacheslav Daradur <
> > > [hidden email]
> > > > >
> > > > > wrote:
> > > > >
> > > > > > Valentin,
> > > > > >
> > > > > > Yes, I have the prototype[1][2]
> > > > > >
> > > > > > You can see an example of Java class[3] that I used in my
> > benchmark.
> > > > > > For example:
> > > > > > class Foo {
> > > > > > @BinaryCompression
> > > > > > String data;
> > > > > > }
> > > > > > If user make decision to store the object in compressed form, he
> > can
> > > > use
> > > > > > the annotation @BinaryCompression as shown above.
> > > > > > It means annotated field 'data' will be compressed at
> marshalling.
> > > > > >
> > > > > > [1] https://github.com/apache/ignite/pull/1951
> > > > > > [2] https://issues.apache.org/jira/browse/IGNITE-5226
> > > > > > [3]
> > > > > > https://github.com/daradurvs/ignite-compression/blob/
> > > > > > master/src/main/java/ru/daradurvs/ignite/compression/
> > > > model/Audit1F.java
> > > > > >
> > > > > >
> > > > > >
> > > > > > 2017-06-08 2:04 GMT+03:00 Valentin Kulichenko <
> > > > > > [hidden email]
> > > > > > >:
> > > > > >
> > > > > > > Vyacheslav, Anton,
> > > > > > >
> > > > > > > Are there any ideas and/or prototypes for the API? Your design
> > > > > > suggestions
> > > > > > > seem to make sense, but I would like to see how it all this
> will
> > > like
> > > > > > from
> > > > > > > user's standpoint.
> > > > > > >
> > > > > > > -Val
> > > > > > >
> > > > > > > On Wed, Jun 7, 2017 at 1:06 AM, Антон Чураев <
> > [hidden email]
> > > >
> > > > > > wrote:
> > > > > > >
> > > > > > > > Vyacheslav, correct me if something wrong
> > > > > > > >
> > > > > > > > We could provide opportunity of choose between CPU usage and
> > > > MEM/NET
> > > > > > > usage
> > > > > > > > for users by compression some attributes of stored objects.
> > > > > > > > You have learned design, and it is possible to localize
> changes
> > > in
> > > > > > > > marshalling without performance affect and current
> > functionality.
> > > > > > > >
> > > > > > > > I think, that it's usefull for our project and users.
> > > > > > > > Community, what do you think about this proposal?
> > > > > > > >
> > > > > > > >
> > > > > > > > 2017-06-06 17:29 GMT+03:00 Vyacheslav Daradur <
> > > [hidden email]
> > > > >:
> > > > > > > >
> > > > > > > > > In short,
> > > > > > > > >
> > > > > > > > > During marshalling a fields is represented as
> > > BinaryFieldAccessor
> > > > > > which
> > > > > > > > > manages its marshalling. It checks if the field is marked
> by
> > > > > > annotation
> > > > > > > > > @BinaryCompression, in that case - binary  representation
> of
> > > > field
> > > > > > > (bytes
> > > > > > > > > array) will be compressed. It will be marked as compressed
> by
> > > > types
> > > > > > > > > constant (GridBinaryMarshaller.COMPRESSED), after this the
> > > > > > compressed
> > > > > > > > > bytes
> > > > > > > > > array wiil be include in binary representation of whole
> > object.
> > > > > Note,
> > > > > > > > > header of marshalled object will not be compressed.
> > Compression
> > > > > > > affected
> > > > > > > > > only object's field representation.
> > > > > > > > >
> > > > > > > > > Objects in IgniteCache is represented as BinaryObject which
> > is
> > > > > > wrapper
> > > > > > > > over
> > > > > > > > > bytes array of marshalled object.
> > > > > > > > > BinaryObject provides some usefull methods, which are used
> by
> > > > > Ignite
> > > > > > > > > systems.
> > > > > > > > > For example, the Queries use BinaryObject#field method,
> which
> > > > > > > > deserializes
> > > > > > > > > only field of object, without deserializing of whole
> object.
> > > > > > > > > BinaryObject#field method during deserialization, if meets
> > the
> > > > > > constant
> > > > > > > > of
> > > > > > > > > compressed type, decompress this bytes array, then continue
> > > > > > > unmarshalling
> > > > > > > > > as usual.
> > > > > > > > >
> > > > > > > > > Now, I introduced the Compressor interface in
> > > > IgniteConfigurations,
> > > > > > it
> > > > > > > > > allows user to use own implementation of compressor - it is
> > the
> > > > > > > > requirement
> > > > > > > > > in the task[1].
> > > > > > > > >
> > > > > > > > > As far as I know, Vladimir Ozerov doesn't like the idea of
> > > > granting
> > > > > > > this
> > > > > > > > > opportunity to the user.
> > > > > > > > > In that case we can choose a compression algorithm which we
> > > will
> > > > > > > provide
> > > > > > > > by
> > > > > > > > > default and will move the interface to internals of binary
> > > > > > > > infractructure.
> > > > > > > > > For this case I've prepared benchmarked, which I've sent
> > > earlier.
> > > > > > > > >
> > > > > > > > > I vote for ZSTD algorithm[2], it provides good compression
> > > ratio
> > > > > and
> > > > > > > good
> > > > > > > > > throughput. It has implementation in Java, .NET and C++,
> and
> > > has
> > > > > > > > > ASF-friendly license, we can use it in the all Ignite
> > > platforms.
> > > > > > > > > You can look at an assessment of this algorithm in my
> > > benchmark's
> > > > > > > > >
> > > > > > > > > [1] https://issues.apache.org/jira/browse/IGNITE-3592
> > > > > > > > > [2]https://github.com/facebook/zstd
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > 2017-06-06 16:02 GMT+03:00 Антон Чураев <
> > [hidden email]
> > > >:
> > > > > > > > >
> > > > > > > > > > Looks good for me.
> > > > > > > > > >
> > > > > > > > > > Could You propose design of implementation in couple of
> > > > > sentences?
> > > > > > > > > > So that we can estimate the completeness and complexity
> of
> > > the
> > > > > > > > proposal.
> > > > > > > > > >
> > > > > > > > > > 2017-06-06 15:26 GMT+03:00 Vyacheslav Daradur <
> > > > > [hidden email]
> > > > > > >:
> > > > > > > > > >
> > > > > > > > > > > Anton,
> > > > > > > > > > >
> > > > > > > > > > > Of course, the solution does not affect on existing
> > > > > > > implementation. I
> > > > > > > > > > mean,
> > > > > > > > > > > there is no changes if user not use the annotation
> > > > > > > > @BinaryCompression.
> > > > > > > > > > (no
> > > > > > > > > > > performance changes)
> > > > > > > > > > > Only if user make decision to use compression on
> specific
> > > > field
> > > > > > or
> > > > > > > > > fields
> > > > > > > > > > > of a class - in that case compression will be used at
> > > > > marshalling
> > > > > > > in
> > > > > > > > > > > relation to annotated fields.
> > > > > > > > > > >
> > > > > > > > > > > 2017-06-06 15:10 GMT+03:00 Антон Чураев <
> > > > [hidden email]
> > > > > >:
> > > > > > > > > > >
> > > > > > > > > > > > Vyacheslav,
> > > > > > > > > > > >
> > > > > > > > > > > > Is it possible to propose implementation that can be
> > > > switched
> > > > > > on
> > > > > > > > > > > on-demand?
> > > > > > > > > > > > In this case it should not affect performance of
> > current
> > > > > > > solution.
> > > > > > > > > > > >
> > > > > > > > > > > > I mean, that users should make decision what is more
> > > > > important
> > > > > > > for
> > > > > > > > > > them:
> > > > > > > > > > > > throutput or memory/net usage.
> > > > > > > > > > > > May be they will be choose not all objects, or only
> > some
> > > > > > > attributes
> > > > > > > > > of
> > > > > > > > > > > > objects for compress.
> > > > > > > > > > > >
> > > > > > > > > > > > 2017-06-06 14:48 GMT+03:00 Vyacheslav Daradur <
> > > > > > > [hidden email]
> > > > > > > > >:
> > > > > > > > > > > >
> > > > > > > > > > > > > Conclusion:
> > > > > > > > > > > > > Provided solution allows reduce size of an object
> in
> > > > > > > IgniteCache
> > > > > > > > at
> > > > > > > > > > the
> > > > > > > > > > > > > cost of throughput reduction (small - in some
> cases),
> > > it
> > > > > > > depends
> > > > > > > > on
> > > > > > > > > > > part
> > > > > > > > > > > > of
> > > > > > > > > > > > > object which will be compressed and compression
> > > > algorithm.
> > > > > > > > > > > > > I mean, we can make more effective use of memory,
> and
> > > in
> > > > > some
> > > > > > > > cases
> > > > > > > > > > it
> > > > > > > > > > > > can
> > > > > > > > > > > > > reduce loading of the interconnect. (replication,
> > > > > > rebalancing)
> > > > > > > > > > > > >
> > > > > > > > > > > > > Especially, it will be particularly useful for
> > object's
> > > > > > fields
> > > > > > > > > which
> > > > > > > > > > > are
> > > > > > > > > > > > > large text (>~ 250 bytes) and can be effectively
> > > > > compressed.
> > > > > > > > > > > > >
> > > > > > > > > > > > > 2017-06-06 12:00 GMT+03:00 Антон Чураев <
> > > > > > [hidden email]
> > > > > > > >:
> > > > > > > > > > > > >
> > > > > > > > > > > > > > Vyacheslav, thank you! But could you please
> > provide a
> > > > > > > > conclusions
> > > > > > > > > > or
> > > > > > > > > > > > > > proposals based on this benchmarks?
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > 2017-06-06 11:28 GMT+03:00 Vyacheslav Daradur <
> > > > > > > > > [hidden email]
> > > > > > > > > > >:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Dmitry,
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Excel-pages:
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > 1). "Compression ratio (2)" - shows object
> size,
> > > with
> > > > > > > > > compression
> > > > > > > > > > > and
> > > > > > > > > > > > > > > without compression. (Conditions: literal text)
> > > > > > > > > > > > > > > 1st graph shows compression ratios of using
> > > different
> > > > > > > > > compression
> > > > > > > > > > > > > > algrithms
> > > > > > > > > > > > > > > depending on size of compressed field.
> > > > > > > > > > > > > > > 2nd graph shows evaluation of size of objects
> > > > depending
> > > > > > on
> > > > > > > > > sizes
> > > > > > > > > > > and
> > > > > > > > > > > > > > > compression algorithms.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > 2). "Compression ratio (1)" - shows object
> size,
> > > with
> > > > > > > > > compression
> > > > > > > > > > > and
> > > > > > > > > > > > > > > without compression. (Conditions:  badly
> > compressed
> > > > > > > character
> > > > > > > > > > > > sequence)
> > > > > > > > > > > > > > > 1st graph shows compression ratios of using
> > > different
> > > > > > > > > compression
> > > > > > > > > > > > > > > algrithms depending on size of compressed
> field.
> > > > > > > > > > > > > > > 2nd graph shows evaluation of size of objects
> > > > depending
> > > > > > on
> > > > > > > > > sizes
> > > > > > > > > > > and
> > > > > > > > > > > > > > > compression algorithms.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > 3) 'put-avg" - shows average time of the "put"
> > > > > operation
> > > > > > > > > > depending
> > > > > > > > > > > on
> > > > > > > > > > > > > > size
> > > > > > > > > > > > > > > and compression algorithms.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > 4) 'put-thrpt" - shows throughput of the "put"
> > > > > operation
> > > > > > > > > > depending
> > > > > > > > > > > on
> > > > > > > > > > > > > > size
> > > > > > > > > > > > > > > and compression algorithms.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > 5) 'get-avg" - shows average time of the "get"
> > > > > operation
> > > > > > > > > > depending
> > > > > > > > > > > on
> > > > > > > > > > > > > > size
> > > > > > > > > > > > > > > and compression algorithms.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > 6) 'get-thrpt" - shows throughput of the "get"
> > > > > operation
> > > > > > > > > > depending
> > > > > > > > > > > on
> > > > > > > > > > > > > > size
> > > > > > > > > > > > > > > and compression algorithms.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > 2017-06-06 10:59 GMT+03:00 Dmitriy Setrakyan <
> > > > > > > > > > > [hidden email]
> > > > > > > > > > > > >:
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Vladimir, I am not sure how to interpret the
> > > > graphs?
> > > > > > What
> > > > > > > > are
> > > > > > > > > > we
> > > > > > > > > > > > > > looking
> > > > > > > > > > > > > > > > at?
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > On Tue, Jun 6, 2017 at 12:33 AM, Vyacheslav
> > > > Daradur <
> > > > > > > > > > > > > > [hidden email]
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Hi, Igniters.
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > I've prepared some benchmarking. Results
> [1].
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > And I've prepared the evaluation in the
> form
> > of
> > > > > > > diagrams
> > > > > > > > > [2].
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > I hope that helps to interest the community
> > and
> > > > > > > > > accelerates a
> > > > > > > > > > > > > > reaction
> > > > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > > this improvment :)
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > [1]
> > > > > > > > > > > > > > > > > https://github.com/daradurvs/
> > > > > > ignite-compression/tree/
> > > > > > > > > > > > > > > > > master/src/main/resources/result
> > > > > > > > > > > > > > > > > [2] https://drive.google.com/file/d/
> > > > > > > > > > > > 0B2CeUAOgrHkoMklyZ25YTEdKcEk/
> > > > > > > > > > > > > > view
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > 2017-05-24 9:49 GMT+03:00 Vyacheslav
> Daradur
> > <
> > > > > > > > > > > > [hidden email]
> > > > > > > > > > > > > >:
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > Guys, any thoughts?
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > 2017-05-16 13:40 GMT+03:00 Vyacheslav
> > > Daradur <
> > > > > > > > > > > > > [hidden email]
> > > > > > > > > > > > > > >:
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > >> Hi guys,
> > > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > > >> I've prepared the PR to show my idea.
> > > > > > > > > > > > > > > > > >> https://github.com/apache/
> > > > > ignite/pull/1951/files
> > > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > > >> About querying - I've just copied
> existing
> > > > tests
> > > > > > and
> > > > > > > > > have
> > > > > > > > > > > > > > annotated
> > > > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > >> testing data.
> > > > > > > > > > > > > > > > > >> https://github.com/apache/
> > > > > > > > ignite/pull/1951/files#diff-
> > > > > > > > > > > c19a9d
> > > > > > > > > > > > > > > > > >> f4058141d059bb577e75244764
> > > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > > >> It means fields which will be marked by
> > > > > > > > > @BinaryCompression
> > > > > > > > > > > > will
> > > > > > > > > > > > > be
> > > > > > > > > > > > > > > > > >> compressed at marshalling via
> > > > BinaryMarshaller.
> > > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > > >> This solution has no effect on existing
> > data
> > > > or
> > > > > > > > project
> > > > > > > > > > > > > > > architecture.
> > > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > > >> I'll be glad to see your thougths.
> > > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > > >> 2017-05-15 19:18 GMT+03:00 Vyacheslav
> > > Daradur
> > > > <
> > > > > > > > > > > > > > [hidden email]
> > > > > > > > > > > > > > > >:
> > > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > > >>> Dmitriy,
> > > > > > > > > > > > > > > > > >>>
> > > > > > > > > > > > > > > > > >>> I have ready prototype. I want to show
> > it.
> > > > > > > > > > > > > > > > > >>> It is always easier to discuss on
> > example.
> > > > > > > > > > > > > > > > > >>>
> > > > > > > > > > > > > > > > > >>> 2017-05-15 19:02 GMT+03:00 Dmitriy
> > > Setrakyan
> > > > <
> > > > > > > > > > > > > > > [hidden email]
> > > > > > > > > > > > > > > > >:
> > > > > > > > > > > > > > > > > >>>
> > > > > > > > > > > > > > > > > >>>> Vyacheslav,
> > > > > > > > > > > > > > > > > >>>>
> > > > > > > > > > > > > > > > > >>>> I think it is a bit premature to
> > provide a
> > > > PR
> > > > > > > > without
> > > > > > > > > > > > getting
> > > > > > > > > > > > > a
> > > > > > > > > > > > > > > > > >>>> community
> > > > > > > > > > > > > > > > > >>>> consensus on the dev list. Please
> allow
> > > some
> > > > > > time
> > > > > > > > for
> > > > > > > > > > the
> > > > > > > > > > > > > > > community
> > > > > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > > >>>> respond.
> > > > > > > > > > > > > > > > > >>>>
> > > > > > > > > > > > > > > > > >>>> D.
> > > > > > > > > > > > > > > > > >>>>
> > > > > > > > > > > > > > > > > >>>> On Mon, May 15, 2017 at 6:36 AM,
> > > Vyacheslav
> > > > > > > Daradur
> > > > > > > > <
> > > > > > > > > > > > > > > > > >>>> [hidden email]>
> > > > > > > > > > > > > > > > > >>>> wrote:
> > > > > > > > > > > > > > > > > >>>>
> > > > > > > > > > > > > > > > > >>>> > I created the ticket:
> > > > > > > > > https://issues.apache.org/jira
> > > > > > > > > > > > > > > > > >>>> /browse/IGNITE-5226
> > > > > > > > > > > > > > > > > >>>> >
> > > > > > > > > > > > > > > > > >>>> > I'll prepare a PR with described
> > > solution
> > > > in
> > > > > > > > couple
> > > > > > > > > of
> > > > > > > > > > > > days.
> > > > > > > > > > > > > > > > > >>>> >
> > > > > > > > > > > > > > > > > >>>> > 2017-05-15 15:05 GMT+03:00
> Vyacheslav
> > > > > Daradur
> > > > > > <
> > > > > > > > > > > > > > > > [hidden email]
> > > > > > > > > > > > > > > > > >:
> > > > > > > > > > > > > > > > > >>>> >
> > > > > > > > > > > > > > > > > >>>> > > Hi, Igniters!
> > > > > > > > > > > > > > > > > >>>> > >
> > > > > > > > > > > > > > > > > >>>> > > Apache 2.0 is released.
> > > > > > > > > > > > > > > > > >>>> > >
> > > > > > > > > > > > > > > > > >>>> > > Let's continue the discussion
> about
> > a
> > > > > > > > compression
> > > > > > > > > > > > design.
> > > > > > > > > > > > > > > > > >>>> > >
> > > > > > > > > > > > > > > > > >>>> > > At the moment, I found only one
> > > solution
> > > > > > which
> > > > > > > > is
> > > > > > > > > > > > > compatible
> > > > > > > > > > > > > > > > with
> > > > > > > > > > > > > > > > > >>>> > querying
> > > > > > > > > > > > > > > > > >>>> > > and indexing, this is
> > > per-objects-field
> > > > > > > > > compression.
> > > > > > > > > > > > > > > > > >>>> > > Per-fields compression means that
> > > > metadata
> > > > > > (a
> > > > > > > > > > header)
> > > > > > > > > > > of
> > > > > > > > > > > > > an
> > > > > > > > > > > > > > > > object
> > > > > > > > > > > > > > > > > >>>> won't
> > > > > > > > > > > > > > > > > >>>> > > be compressed, only serialized
> > values
> > > of
> > > > > an
> > > > > > > > object
> > > > > > > > > > > > fields
> > > > > > > > > > > > > > (in
> > > > > > > > > > > > > > > > > bytes
> > > > > > > > > > > > > > > > > >>>> array
> > > > > > > > > > > > > > > > > >>>> > > form) will be compressed.
> > > > > > > > > > > > > > > > > >>>> > >
> > > > > > > > > > > > > > > > > >>>> > > This solution have some
> contentious
> > > > > issues:
> > > > > > > > > > > > > > > > > >>>> > > - small values, like primitives
> and
> > > > short
> > > > > > > > arrays -
> > > > > > > > > > > there
> > > > > > > > > > > > > > isn't
> > > > > > > > > > > > > > > > > >>>> sense to
> > > > > > > > > > > > > > > > > >>>> > > compress them;
> > > > > > > > > > > > > > > > > >>>> > > - there is no possible to use
> > > > compression
> > > > > > with
> > > > > > > > > > > > > > java-predefined
> > > > > > > > > > > > > > > > > >>>> types;
> > > > > > > > > > > > > > > > > >>>> > >
> > > > > > > > > > > > > > > > > >>>> > > We can provide an annotation,
> > > > > > > > @IgniteCompression -
> > > > > > > > > > for
> > > > > > > > > > > > > > > example,
> > > > > > > > > > > > > > > > > >>>> which can
> > > > > > > > > > > > > > > > > >>>> > > be used by users for marking
> fields
> > to
> > > > > > > compress.
> > > > > > > > > > > > > > > > > >>>> > >
> > > > > > > > > > > > > > > > > >>>> > > Any thoughts?
> > > > > > > > > > > > > > > > > >>>> > >
> > > > > > > > > > > > > > > > > >>>> > > Maybe someone already have ready
> > > design?
> > > > > > > > > > > > > > > > > >>>> > >
> > > > > > > > > > > > > > > > > >>>> > > 2017-04-10 11:06 GMT+03:00
> > Vyacheslav
> > > > > > Daradur
> > > > > > > <
> > > > > > > > > > > > > > > > > [hidden email]
> > > > > > > > > > > > > > > > > >>>> >:
> > > > > > > > > > > > > > > > > >>>> > >
> > > > > > > > > > > > > > > > > >>>> > >> Alexey,
> > > > > > > > > > > > > > > > > >>>> > >>
> > > > > > > > > > > > > > > > > >>>> > >> Yes, I've read it.
> > > > > > > > > > > > > > > > > >>>> > >>
> > > > > > > > > > > > > > > > > >>>> > >> Ok, let's discuss about public
> API
> > > > > design.
> > > > > > > > > > > > > > > > > >>>> > >>
> > > > > > > > > > > > > > > > > >>>> > >> I think we need to add some a
> > > configure
> > > > > > > entity
> > > > > > > > to
> > > > > > > > > > > > > > > > > >>>> CacheConfiguration,
> > > > > > > > > > > > > > > > > >>>> > >> which will contain the Compressor
> > > > > interface
> > > > > > > > > > > > > implementation
> > > > > > > > > > > > > > > and
> > > > > > > > > > > > > > > > > some
> > > > > > > > > > > > > > > > > >>>> > usefull
> > > > > > > > > > > > > > > > > >>>> > >> parameters.
> > > > > > > > > > > > > > > > > >>>> > >> Or maybe to provide a
> > > BinaryMarshaller
> > > > > > > > decorator,
> > > > > > > > > > > which
> > > > > > > > > > > > > > will
> > > > > > > > > > > > > > > be
> > > > > > > > > > > > > > > > > >>>> compress
> > > > > > > > > > > > > > > > > >>>> > >> data after marshalling.
> > > > > > > > > > > > > > > > > >>>> > >>
> > > > > > > > > > > > > > > > > >>>> > >>
> > > > > > > > > > > > > > > > > >>>> > >> 2017-04-10 10:40 GMT+03:00 Alexey
> > > > > > Kuznetsov <
> > > > > > > > > > > > > > > > > [hidden email]
> > > > > > > > > > > > > > > > > >>>> >:
> > > > > > > > > > > > > > > > > >>>> > >>
> > > > > > > > > > > > > > > > > >>>> > >>> Vyacheslav,
> > > > > > > > > > > > > > > > > >>>> > >>>
> > > > > > > > > > > > > > > > > >>>> > >>> Did you read initial discussion
> > [1]
> > > > > about
> > > > > > > > > > > compression?
> > > > > > > > > > > > > > > > > >>>> > >>> As far as I remember we agreed
> to
> > > add
> > > > > only
> > > > > > > > some
> > > > > > > > > > > > > > "top-level"
> > > > > > > > > > > > > > > > API
> > > > > > > > > > > > > > > > > in
> > > > > > > > > > > > > > > > > >>>> > order
> > > > > > > > > > > > > > > > > >>>> > >>> to
> > > > > > > > > > > > > > > > > >>>> > >>> provide a way for
> > > > > > > > > > > > > > > > > >>>> > >>> Ignite users to inject some sort
> > of
> > > > > custom
> > > > > > > > > > > > compression.
> > > > > > > > > > > > > > > > > >>>> > >>>
> > > > > > > > > > > > > > > > > >>>> > >>>
> > > > > > > > > > > > > > > > > >>>> > >>> [1]
> > > > > > > > > > > > > > > > > >>>> > >>> http://apache-ignite-developer
> > > > > > > > > s.2346864.n4.nabble
> > > > > > > > > > .
> > > > > > > > > > > > > > > com/Data-c
> > > > > > > > > > > > > > > > > >>>> > >>> ompression-in-Ignite-2-0-
> > > td10099.html
> > > > > > > > > > > > > > > > > >>>> > >>>
> > > > > > > > > > > > > > > > > >>>> > >>> On Mon, Apr 10, 2017 at 2:19 PM,
> > > > > > daradurvs <
> > > > > > > > > > > > > > > > [hidden email]
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > >>>> > wrote:
> > > > > > > > > > > > > > > > > >>>> > >>>
> > > > > > > > > > > > > > > > > >>>> > >>> > Hi Igniters!
> > > > > > > > > > > > > > > > > >>>> > >>> >
> > > > > > > > > > > > > > > > > >>>> > >>> > I am interested in this task.
> > > > > > > > > > > > > > > > > >>>> > >>> > Provide some kind of pluggable
> > > > > > compression
> > > > > > > > SPI
> > > > > > > > > > > > support
> > > > > > > > > > > > > > > > > >>>> > >>> > <https://issues.apache.org/
> > > > > > > > > > > jira/browse/IGNITE-3592>
> > > > > > > > > > > > > > > > > >>>> > >>> >
> > > > > > > > > > > > > > > > > >>>> > >>> > I developed a solution on
> > > > > > > > > > BinaryMarshaller-level,
> > > > > > > > > > > > but
> > > > > > > > > > > > > > > > reviewer
> > > > > > > > > > > > > > > > > >>>> has
> > > > > > > > > > > > > > > > > >>>> > >>> rejected
> > > > > > > > > > > > > > > > > >>>> > >>> > it.
> > > > > > > > > > > > > > > > > >>>> > >>> >
> > > > > > > > > > > > > > > > > >>>> > >>> > Let's continue discussion of
> > task
> > > > > goals
> > > > > > > and
> > > > > > > > > > > solution
> > > > > > > > > > > > > > > design.
> > > > > > > > > > > > > > > > > >>>> > >>> > As I understood that, the main
> > > goal
> > > > of
> > > > > > > this
> > > > > > > > > task
> > > > > > > > > > > is
> > > > > > > > > > > > to
> > > > > > > > > > > > > > > store
> > > > > > > > > > > > > > > > > >>>> data in
> > > > > > > > > > > > > > > > > >>>> > >>> > compressed form.
> > > > > > > > > > > > > > > > > >>>> > >>> > This is what I need from
> Ignite
> > as
> > > > its
> > > > > > > user.
> > > > > > > > > > > > > Compression
> > > > > > > > > > > > > > > > > >>>> provides
> > > > > > > > > > > > > > > > > >>>> > >>> economy
> > > > > > > > > > > > > > > > > >>>> > >>> > on
> > > > > > > > > > > > > > > > > >>>> > >>> > servers.
> > > > > > > > > > > > > > > > > >>>> > >>> > We can store more data on same
> > > > servers
> > > > > > at
> > > > > > > > the
> > > > > > > > > > cost
> > > > > > > > > > > > of
> > > > > > > > > > > > > > > > > >>>> increasing CPU
> > > > > > > > > > > > > > > > > >>>> > >>> > utilization.
> > > > > > > > > > > > > > > > > >>>> > >>> >
> > > > > > > > > > > > > > > > > >>>> > >>> > I'm researching a possibility
> of
> > > > > > > > > implementation
> > > > > > > > > > of
> > > > > > > > > > > > > > > > compression
> > > > > > > > > > > > > > > > > >>>> at the
> > > > > > > > > > > > > > > > > >>>> > >>> > cache-level.
> > > > > > > > > > > > > > > > > >>>> > >>> >
> > > > > > > > > > > > > > > > > >>>> > >>> > Any thoughts?
> > > > > > > > > > > > > > > > > >>>> > >>> >
> > > > > > > > > > > > > > > > > >>>> > >>> > --
> > > > > > > > > > > > > > > > > >>>> > >>> > Best regards,
> > > > > > > > > > > > > > > > > >>>> > >>> > Vyacheslav



--
Best Regards, Vyacheslav

Re: Data compression in Ignite 2.0

dsetrakyan
Vyacheslav,

When this feature started out as data compression in Ignite, it sounded
very useful. Now it is unfolding as per-field compression, which is much
less useful. In fact, it is questionable whether it is useful at all. The
fact that this feature is implemented does not make it mandatory for the
community to accept it.

However, as I mentioned before, per-field encryption is very useful, as it
would allow users to automatically encrypt certain sensitive fields, like
passwords, credit card numbers, etc. There is not much conceptual
difference between compressing a field and encrypting a field. Would it be
possible to change your implementation to handle encryption instead?

D.

On Thu, Jun 8, 2017 at 10:42 PM, Vyacheslav Daradur <[hidden email]>
wrote:

> Guys, I want to be clear:
> * The "per-field compression" design is the result of researching Ignite's
> binary infrastructure and some of its other parts (querying, indexing,
> etc.).
> * Full compression of an object would be more effective, but in that case
> it is not compatible with querying and indexing (or there is a large
> overhead from decompressing the full object (or cache pages) on demand).
> * "Per-field compression" is one of the ways to implement the compression
> feature.
>
> I'm new to Ignite, so I may be mistaken about some things.
> For the last 3-4 months I've tried to start a discussion about a design,
> but nobody has answered (except Dmitry and Valentin, who were interested
> in how it works).
> But I understand that this is a community and nobody is obliged to anybody.
>
> There are strong Ignite experts here.
> If they can help me and the community with a design for the compression
> feature, it will be great.
> At the moment I have the desire and the time to work on the compression
> feature in Ignite.
> Let's use this opportunity :)
>
> 2017-06-09 5:36 GMT+03:00 Dmitriy Setrakyan <[hidden email]>:
>
> > Igniters,
> >
> > I have never seen a single Ignite user asking about compressing a single
> > field. However, we have had requests to secure certain fields, e.g.
> > passwords.
> >
> > I personally do not think per-field compression is needed, unless we can
> > point out some concrete real life use cases.
> >
> > D.
> >
> > On Thu, Jun 8, 2017 at 3:42 AM, Vyacheslav Daradur <[hidden email]>
> > wrote:
> >
> > > Anton,
> > >
> > > >> I thought that if compressed data is stored in memory, it will
> > > >> also be transmitted over the wire in compressed form. Is that right?
> > >
> > > In the per-field compression case, yes.
> > >
> > > 2017-06-08 13:36 GMT+03:00 Антон Чураев <[hidden email]>:
> > >
> > > > Guys, could you please help me?
> > > > I thought that if compressed data is stored in memory, it will
> > > > also be transmitted over the wire in compressed form. Is that right?
> > > >
> > > > 2017-06-08 13:30 GMT+03:00 Vyacheslav Daradur <[hidden email]>:
> > > >
> > > > > Vladimir,
> > > > >
> > > > > The main problem I am trying to solve is storing data in memory in
> > > > > compressed form via Ignite.
> > > > > The main goal is to use memory more effectively.
> > > > >
> > > > > >> here the much simpler step would be full compression on a
> > > > > >> per-cache basis rather than dealing with the per-field case.
> > > > >
> > > > > Please explain your idea. Compress data by memory page?
> > > > > Is it compatible with querying and indexing?
> > > > >
> > > > > >> In the end, if a user would like to compress a particular field,
> > > > > >> he can always do it on his own
> > > > > I don't think we should reason this way: if a user needs something,
> > > > > he tries to choose a tool which has this feature OOTB.
> > > > >
> > > > >
> > > > >
> > > > > 2017-06-08 12:53 GMT+03:00 Vladimir Ozerov <[hidden email]>:
> > > > >
> > > > > > Igniters,
> > > > > >
> > > > > > Honestly, I still do not see how to apply this feature gracefully
> > > > > > to Ignite. And the overall approach of compressing only particular
> > > > > > fields looks overcomplicated to me. Remember that our main use case
> > > > > > is an application without classes on the server. It means that any
> > > > > > kind of annotations are inapplicable. To be more precise: a proper
> > > > > > API should be implemented to handle the no-class case (e.g. how
> > > > > > would one build such an object through BinaryBuilder without a
> > > > > > class?), and only then should annotations be added as a convenient
> > > > > > addition to the more basic API.
> > > > > >
> > > > > > It seems to me that a full implementation, which takes into account
> > > > > > a proper "classless" API, changes to binary metadata to reflect
> > > > > > compressed fields, changes to SQL, changes to the binary protocol,
> > > > > > and porting to .NET and CPP, will yield a very complex solution
> > > > > > with little value to the product.
> > > > > >
> > > > > > Instead, as I proposed earlier, it seems that we'd better start
> > > > > > with the problem we are trying to solve. Basically, compression
> > > > > > could help in two cases:
> > > > > > 1) Transmitting data over the wire - it should be implemented on
> > > > > > the communication layer and should not affect the binary
> > > > > > serialization component a lot.
> > > > > > 2) Storing data in memory - here the much simpler step would be
> > > > > > full compression on a per-cache basis rather than dealing with the
> > > > > > per-field case.
> > > > > >
> > > > > > In the end, if a user would like to compress a particular field, he
> > > > > > can always do it on his own, and set the already compressed field
> > > > > > on our BinaryObject.
> > > > > >
> > > > > > Vladimir.
> > > > > >
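To make the last point of the quoted message concrete: a user can compress a field by hand before the value ever reaches the cache. The sketch below is only an illustration under assumptions. It uses java.util.zip from the JDK, the class and method names are hypothetical, and nothing in it is Ignite API. The resulting byte[] would be stored as an ordinary field.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

/** Hypothetical helper (not Ignite API): compresses a text field by hand. */
final class ManualFieldCompression {
    /** Compresses UTF-8 text with DEFLATE and returns the compressed bytes. */
    static byte[] compress(String text) {
        Deflater deflater = new Deflater();
        deflater.setInput(text.getBytes(StandardCharsets.UTF_8));
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] chunk = new byte[1024];
        while (!deflater.finished()) {
            out.write(chunk, 0, deflater.deflate(chunk));
        }
        deflater.end();
        return out.toByteArray();
    }

    /** Restores the original text; throws IllegalArgumentException on bad input. */
    static String decompress(byte[] compressed) {
        Inflater inflater = new Inflater();
        inflater.setInput(compressed);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] chunk = new byte[1024];
        try {
            while (!inflater.finished()) {
                int n = inflater.inflate(chunk);
                if (n == 0 && !inflater.finished()) {
                    // No progress and not finished: the input is truncated
                    // or expects a preset dictionary.
                    throw new IllegalArgumentException("Incomplete compressed data");
                }
                out.write(chunk, 0, n);
            }
        } catch (DataFormatException e) {
            throw new IllegalArgumentException("Corrupt compressed data", e);
        } finally {
            inflater.end();
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }
}
```

The cost of doing it by hand is the one discussed in this thread: the stored value is an opaque byte array, so the original text cannot be queried or indexed, and every reader has to know that the field must be decompressed.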
> > > > > >
> > > > > > On Thu, Jun 8, 2017 at 12:37 PM, Vyacheslav Daradur <
> > > > [hidden email]
> > > > > >
> > > > > > wrote:
> > > > > >
> > > > > > > Valentin,
> > > > > > >
> > > > > > > Yes, I have a prototype [1][2].
> > > > > > >
> > > > > > > You can see an example of a Java class [3] that I used in my
> > > > > > > benchmark.
> > > > > > > For example:
> > > > > > > class Foo {
> > > > > > >     @BinaryCompression
> > > > > > >     String data;
> > > > > > > }
> > > > > > > If a user decides to store the object in compressed form, he can
> > > > > > > use the annotation @BinaryCompression as shown above.
> > > > > > > It means the annotated field 'data' will be compressed at
> > > > > > > marshalling.
> > > > > > >
> > > > > > > [1] https://github.com/apache/ignite/pull/1951
> > > > > > > [2] https://issues.apache.org/jira/browse/IGNITE-5226
> > > > > > > [3] https://github.com/daradurvs/ignite-compression/blob/master/src/main/java/ru/daradurvs/ignite/compression/model/Audit1F.java
> > > > > > >
> > > > > > >
> > > > > > >
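For readers who want to try the annotation idea outside Ignite, here is a minimal sketch of how a marshaller might discover annotated fields through reflection. It rests on assumptions: the annotation is declared locally as a stand-in for the @BinaryCompression of the prototype, and the scanner class is hypothetical rather than taken from the PR.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.List;

/** Local stand-in for the prototype's annotation (an assumption, not Ignite API). */
@Retention(RetentionPolicy.RUNTIME) // must be visible to reflection at runtime
@Target(ElementType.FIELD)
@interface BinaryCompression {}

/** Example class from the thread, with one extra field that is not annotated. */
final class Foo {
    @BinaryCompression
    String data;
    int id;
}

/** Hypothetical scanner: lists the fields a marshaller would compress. */
final class CompressedFieldScanner {
    static List<String> compressedFields(Class<?> cls) {
        List<String> names = new ArrayList<>();
        for (Field field : cls.getDeclaredFields()) {
            if (field.isAnnotationPresent(BinaryCompression.class)) {
                names.add(field.getName());
            }
        }
        return names;
    }
}
```

Calling CompressedFieldScanner.compressedFields(Foo.class) returns a list containing only "data". A scheme like this needs the class on the classpath, which is why the "classless" server case raised elsewhere in the thread needs a separate API.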
> > > > > > > 2017-06-08 2:04 GMT+03:00 Valentin Kulichenko <
> > > > > > > [hidden email]
> > > > > > > >:
> > > > > > >
> > > > > > > > Vyacheslav, Anton,
> > > > > > > >
> > > > > > > > Are there any ideas and/or prototypes for the API? Your design
> > > > > > > > suggestions seem to make sense, but I would like to see how
> > > > > > > > all this will look from the user's standpoint.
> > > > > > > >
> > > > > > > > -Val
> > > > > > > >
> > > > > > > > On Wed, Jun 7, 2017 at 1:06 AM, Антон Чураев <
> > > [hidden email]
> > > > >
> > > > > > > wrote:
> > > > > > > >
> > > > > > > > > Vyacheslav, correct me if something is wrong.
> > > > > > > > >
> > > > > > > > > We could give users the opportunity to choose between CPU
> > > > > > > > > usage and MEM/NET usage by compressing some attributes of
> > > > > > > > > stored objects.
> > > > > > > > > You have studied the design, and it is possible to localize
> > > > > > > > > the changes in marshalling without affecting performance or
> > > > > > > > > current functionality.
> > > > > > > > >
> > > > > > > > > I think that it's useful for our project and users.
> > > > > > > > > Community, what do you think about this proposal?
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > 2017-06-06 17:29 GMT+03:00 Vyacheslav Daradur <
> > > > [hidden email]
> > > > > >:
> > > > > > > > >
> > > > > > > > > > In short,
> > > > > > > > > >
> > > > > > > > > > During marshalling, a field is represented by a
> > > > > > > > > > BinaryFieldAccessor, which manages its marshalling. It
> > > > > > > > > > checks whether the field is marked with the annotation
> > > > > > > > > > @BinaryCompression; in that case the binary representation
> > > > > > > > > > of the field (a byte array) will be compressed. It will be
> > > > > > > > > > marked as compressed by a type constant
> > > > > > > > > > (GridBinaryMarshaller.COMPRESSED); after this the
> > > > > > > > > > compressed byte array will be included in the binary
> > > > > > > > > > representation of the whole object. Note that the header
> > > > > > > > > > of the marshalled object will not be compressed.
> > > > > > > > > > Compression affects only the representation of the
> > > > > > > > > > object's fields.
> > > > > > > > > >
> > > > > > > > > > Objects in IgniteCache are represented as BinaryObject,
> > > > > > > > > > which is a wrapper over the byte array of a marshalled
> > > > > > > > > > object.
> > > > > > > > > > BinaryObject provides some useful methods, which are used
> > > > > > > > > > by Ignite systems.
> > > > > > > > > > For example, queries use the BinaryObject#field method,
> > > > > > > > > > which deserializes only one field of an object, without
> > > > > > > > > > deserializing the whole object.
> > > > > > > > > > During deserialization, if the BinaryObject#field method
> > > > > > > > > > meets the constant of the compressed type, it decompresses
> > > > > > > > > > this byte array, then continues unmarshalling as usual.
> > > > > > > > > >
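The scheme described above could look roughly like the sketch below: a field is written with a type tag in front of it, and the reader decompresses only when it meets the "compressed" tag. This is an illustration, not the code from the PR. The tag values, the layout (tag, original length, payload) and the class name are assumptions, and java.util.zip stands in for whichever algorithm ends up being chosen.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

/** Hypothetical codec: [tag:1 byte][original length:4 bytes][payload]. */
final class TaggedFieldCodec {
    static final byte PLAIN = 0;      // assumed tag value, not Ignite's
    static final byte COMPRESSED = 1; // assumed tag value, not Ignite's

    /** Writes a string field, compressing the payload when asked to. */
    static byte[] write(String value, boolean compress) {
        byte[] raw = value.getBytes(StandardCharsets.UTF_8);
        byte[] payload = compress ? deflate(raw) : raw;
        return ByteBuffer.allocate(5 + payload.length)
            .put(compress ? COMPRESSED : PLAIN)
            .putInt(raw.length)
            .put(payload)
            .array();
    }

    /** Reads a field back, decompressing only if the tag says so. */
    static String read(byte[] field) {
        ByteBuffer buf = ByteBuffer.wrap(field);
        byte tag = buf.get();
        int rawLen = buf.getInt();
        byte[] payload = new byte[buf.remaining()];
        buf.get(payload);
        byte[] raw = tag == COMPRESSED ? inflate(payload, rawLen) : payload;
        return new String(raw, StandardCharsets.UTF_8);
    }

    private static byte[] deflate(byte[] raw) {
        Deflater deflater = new Deflater();
        deflater.setInput(raw);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] chunk = new byte[512];
        while (!deflater.finished()) {
            out.write(chunk, 0, deflater.deflate(chunk));
        }
        deflater.end();
        return out.toByteArray();
    }

    private static byte[] inflate(byte[] payload, int rawLen) {
        Inflater inflater = new Inflater();
        inflater.setInput(payload);
        byte[] raw = new byte[rawLen];
        try {
            int off = 0;
            while (off < rawLen) {
                int n = inflater.inflate(raw, off, rawLen - off);
                if (n == 0) {
                    throw new IllegalArgumentException("Corrupt compressed field");
                }
                off += n;
            }
            return raw;
        } catch (DataFormatException e) {
            throw new IllegalArgumentException("Corrupt compressed field", e);
        } finally {
            inflater.end();
        }
    }
}
```

Because the tag travels with the field, readers of uncompressed fields pay only a one-byte check, which is in line with the claim in the thread that unannotated fields are not affected.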
> > > > > > > > > > Now, I have introduced the Compressor interface in
> > > > > > > > > > IgniteConfiguration; it allows a user to plug in their own
> > > > > > > > > > compressor implementation - it is the requirement in the
> > > > > > > > > > task [1].
> > > > > > > > > >
> > > > > > > > > > As far as I know, Vladimir Ozerov doesn't like the idea of
> > > > > > > > > > granting this opportunity to the user.
> > > > > > > > > > In that case we can choose a compression algorithm which we
> > > > > > > > > > will provide by default and move the interface to the
> > > > > > > > > > internals of the binary infrastructure.
> > > > > > > > > > For this case I've prepared the benchmarks which I sent
> > > > > > > > > > earlier.
> > > > > > > > > >
> > > > > > > > > > I vote for the ZSTD algorithm [2]; it provides a good
> > > > > > > > > > compression ratio and good throughput. It has
> > > > > > > > > > implementations in Java, .NET and C++, and has an
> > > > > > > > > > ASF-friendly license, so we can use it on all Ignite
> > > > > > > > > > platforms.
> > > > > > > > > > You can look at an assessment of this algorithm in my
> > > > > > > > > > benchmarks.
> > > > > > > > > >
> > > > > > > > > > [1] https://issues.apache.org/jira/browse/IGNITE-3592
> > > > > > > > > > [2] https://github.com/facebook/zstd
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > 2017-06-06 16:02 GMT+03:00 Антон Чураев <
> > > [hidden email]
> > > > >:
> > > > > > > > > >
> > > > > > > > > > > Looks good to me.
> > > > > > > > > > >
> > > > > > > > > > > Could you describe the design of the implementation in a
> > > > > > > > > > > couple of sentences, so that we can estimate the
> > > > > > > > > > > completeness and complexity of the proposal?
> > > > > > > > > > >
> > > > > > > > > > > 2017-06-06 15:26 GMT+03:00 Vyacheslav Daradur <
> > > > > > [hidden email]
> > > > > > > >:
> > > > > > > > > > >
> > > > > > > > > > > > Anton,
> > > > > > > > > > > >
> > > > > > > > > > > > Of course, the solution does not affect the existing
> > > > > > > > > > > > implementation. I mean, there are no changes if the
> > > > > > > > > > > > user does not use the annotation @BinaryCompression
> > > > > > > > > > > > (no performance changes).
> > > > > > > > > > > > Only if the user decides to use compression on a
> > > > > > > > > > > > specific field or fields of a class will compression
> > > > > > > > > > > > be applied at marshalling, and only to the annotated
> > > > > > > > > > > > fields.
> > > > > > > > > > > >
> > > > > > > > > > > > 2017-06-06 15:10 GMT+03:00 Антон Чураев <
> > > > > [hidden email]
> > > > > > >:
> > > > > > > > > > > >
> > > > > > > > > > > > > Vyacheslav,
> > > > > > > > > > > > >
> > > > > > > > > > > > > Is it possible to propose an implementation that can
> > > > > > > > > > > > > be switched on on demand?
> > > > > > > > > > > > > In this case it should not affect the performance of
> > > > > > > > > > > > > the current solution.
> > > > > > > > > > > > >
> > > > > > > > > > > > > I mean that users should decide what is more
> > > > > > > > > > > > > important for them: throughput or memory/net usage.
> > > > > > > > > > > > > Maybe they will choose to compress not all objects,
> > > > > > > > > > > > > or only some attributes of objects.
> > > > > > > > > > > > >
> > > > > > > > > > > > > 2017-06-06 14:48 GMT+03:00 Vyacheslav Daradur <
> > > > > > > > [hidden email]
> > > > > > > > > >:
> > > > > > > > > > > > >
> > > > > > > > > > > > > > Conclusion:
> > > > > > > > > > > > > > The provided solution allows reducing the size of
> > > > > > > > > > > > > > an object in IgniteCache at the cost of a
> > > > > > > > > > > > > > throughput reduction (small, in some cases); it
> > > > > > > > > > > > > > depends on the part of the object which will be
> > > > > > > > > > > > > > compressed and on the compression algorithm.
> > > > > > > > > > > > > > I mean, we can make more effective use of memory,
> > > > > > > > > > > > > > and in some cases it can reduce the load on the
> > > > > > > > > > > > > > interconnect (replication, rebalancing).
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > It will be particularly useful for object fields
> > > > > > > > > > > > > > which are large text (>~ 250 bytes) and can be
> > > > > > > > > > > > > > effectively compressed.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > 2017-06-06 12:00 GMT+03:00 Антон Чураев <
> > > > > > > [hidden email]
> > > > > > > > >:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Vyacheslav, thank you! But could you please
> > > > > > > > > > > > > > > provide conclusions or proposals based on these
> > > > > > > > > > > > > > > benchmarks?
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > 2017-06-06 11:28 GMT+03:00 Vyacheslav Daradur <
> > > > > > > > > > [hidden email]
> > > > > > > > > > > >:
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Dmitry,
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Excel pages:
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > 1) "Compression ratio (2)" - shows object size
> > > > > > > > > > > > > > > > with and without compression. (Conditions:
> > > > > > > > > > > > > > > > literal text)
> > > > > > > > > > > > > > > > The 1st graph shows the compression ratios of
> > > > > > > > > > > > > > > > different compression algorithms depending on
> > > > > > > > > > > > > > > > the size of the compressed field.
> > > > > > > > > > > > > > > > The 2nd graph shows the size of objects
> > > > > > > > > > > > > > > > depending on field sizes and compression
> > > > > > > > > > > > > > > > algorithms.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > 2) "Compression ratio (1)" - shows object size
> > > > > > > > > > > > > > > > with and without compression. (Conditions:
> > > > > > > > > > > > > > > > badly compressible character sequence)
> > > > > > > > > > > > > > > > The 1st graph shows the compression ratios of
> > > > > > > > > > > > > > > > different compression algorithms depending on
> > > > > > > > > > > > > > > > the size of the compressed field.
> > > > > > > > > > > > > > > > The 2nd graph shows the size of objects
> > > > > > > > > > > > > > > > depending on field sizes and compression
> > > > > > > > > > > > > > > > algorithms.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > 3) "put-avg" - shows the average time of the
> > > > > > > > > > > > > > > > "put" operation depending on size and
> > > > > > > > > > > > > > > > compression algorithm.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > 4) "put-thrpt" - shows the throughput of the
> > > > > > > > > > > > > > > > "put" operation depending on size and
> > > > > > > > > > > > > > > > compression algorithm.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > 5) "get-avg" - shows the average time of the
> > > > > > > > > > > > > > > > "get" operation depending on size and
> > > > > > > > > > > > > > > > compression algorithm.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > 6) "get-thrpt" - shows the throughput of the
> > > > > > > > > > > > > > > > "get" operation depending on size and
> > > > > > > > > > > > > > > > compression algorithm.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > 2017-06-06 10:59 GMT+03:00 Dmitriy Setrakyan
> <
> > > > > > > > > > > > [hidden email]
> > > > > > > > > > > > > >:
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Vladimir, I am not sure how to interpret the
> > > > > > > > > > > > > > > > > graphs? What are we looking at?
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > On Tue, Jun 6, 2017 at 12:33 AM, Vyacheslav
> > > > > Daradur <
> > > > > > > > > > > > > > > [hidden email]
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > Hi, Igniters.
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > I've prepared some benchmarks. Results [1].
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > And I've prepared an evaluation in the form
> > > > > > > > > > > > > > > > > > of diagrams [2].
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > I hope this helps to interest the community
> > > > > > > > > > > > > > > > > > and speeds up a reaction to this
> > > > > > > > > > > > > > > > > > improvement :)
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > [1] https://github.com/daradurvs/ignite-compression/tree/master/src/main/resources/result
> > > > > > > > > > > > > > > > > > [2] https://drive.google.com/file/d/0B2CeUAOgrHkoMklyZ25YTEdKcEk/view
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > 2017-05-24 9:49 GMT+03:00 Vyacheslav
> > Daradur
> > > <
> > > > > > > > > > > > > [hidden email]
> > > > > > > > > > > > > > >:
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > Guys, any thoughts?
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > 2017-05-16 13:40 GMT+03:00 Vyacheslav
> > > > Daradur <
> > > > > > > > > > > > > > [hidden email]
> > > > > > > > > > > > > > > >:
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > >> Hi guys,
> > > > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > > > >> I've prepared the PR to show my idea.
> > > > > > > > > > > > > > > > > > >> https://github.com/apache/ignite/pull/1951/files
> > > > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > > > >> About querying - I've just copied existing
> > > > > > > > > > > > > > > > > > >> tests and annotated the testing data.
> > > > > > > > > > > > > > > > > > >> https://github.com/apache/ignite/pull/1951/files#diff-c19a9df4058141d059bb577e75244764
> > > > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > > > >> It means fields marked with
> > > > > > > > > > > > > > > > > > >> @BinaryCompression will be compressed at
> > > > > > > > > > > > > > > > > > >> marshalling via BinaryMarshaller.
> > > > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > > > >> This solution has no effect on existing data
> > > > > > > > > > > > > > > > > > >> or project architecture.
> > > > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > > > >> I'll be glad to see your thoughts.
> > > > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > > > >> 2017-05-15 19:18 GMT+03:00 Vyacheslav
> > > > Daradur
> > > > > <
> > > > > > > > > > > > > > > [hidden email]
> > > > > > > > > > > > > > > > >:
> > > > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > > > >>> Dmitriy,
> > > > > > > > > > > > > > > > > > >>>
> > > > > > > > > > > > > > > > > > >>> I have a ready prototype. I want to show it.
> > > > > > > > > > > > > > > > > > >>> It is always easier to discuss with an
> > > > > > > > > > > > > > > > > > >>> example.
> > > > > > > > > > > > > > > > > > >>>
> > > > > > > > > > > > > > > > > > >>> 2017-05-15 19:02 GMT+03:00 Dmitriy
> > > > Setrakyan
> > > > > <
> > > > > > > > > > > > > > > > [hidden email]
> > > > > > > > > > > > > > > > > >:
> > > > > > > > > > > > > > > > > > >>>
> > > > > > > > > > > > > > > > > > >>>> Vyacheslav,
> > > > > > > > > > > > > > > > > > >>>>
> > > > > > > > > > > > > > > > > > >>>> I think it is a bit premature to provide a
> > > > > > > > > > > > > > > > > > >>>> PR without getting a community consensus on
> > > > > > > > > > > > > > > > > > >>>> the dev list. Please allow some time for the
> > > > > > > > > > > > > > > > > > >>>> community to respond.
> > > > > > > > > > > > > > > > > > >>>>
> > > > > > > > > > > > > > > > > > >>>> D.
> > > > > > > > > > > > > > > > > > >>>>
> > > > > > > > > > > > > > > > > > >>>> On Mon, May 15, 2017 at 6:36 AM,
> > > > Vyacheslav
> > > > > > > > Daradur
> > > > > > > > > <
> > > > > > > > > > > > > > > > > > >>>> [hidden email]>
> > > > > > > > > > > > > > > > > > >>>> wrote:
> > > > > > > > > > > > > > > > > > >>>>
> > > > > > > > > > > > > > > > > > >>>> > I created the ticket:
> > > > > > > > > > > > > > > > > > >>>> > https://issues.apache.org/jira/browse/IGNITE-5226
> > > > > > > > > > > > > > > > > > >>>> >
> > > > > > > > > > > > > > > > > > >>>> > I'll prepare a PR with the described
> > > > > > > > > > > > > > > > > > >>>> > solution in a couple of days.
> > > > > > > > > > > > > > > > > > >>>> >
> > > > > > > > > > > > > > > > > > >>>> > 2017-05-15 15:05 GMT+03:00
> > Vyacheslav
> > > > > > Daradur
> > > > > > > <
> > > > > > > > > > > > > > > > > [hidden email]
> > > > > > > > > > > > > > > > > > >:
> > > > > > > > > > > > > > > > > > >>>> >
> > > > > > > > > > > > > > > > > > >>>> > > Hi, Igniters!
> > > > > > > > > > > > > > > > > > >>>> > >
> > > > > > > > > > > > > > > > > > >>>> > > Apache Ignite 2.0 is released.
> > > > > > > > > > > > > > > > > > >>>> > >
> > > > > > > > > > > > > > > > > > >>>> > > Let's continue the discussion about a
> > > > > > > > > > > > > > > > > > >>>> > > compression design.
> > > > > > > > > > > > > > > > > > >>>> > >
> > > > > > > > > > > > > > > > > > >>>> > > At the moment, I have found only one
> > > > > > > > > > > > > > > > > > >>>> > > solution which is compatible with querying
> > > > > > > > > > > > > > > > > > >>>> > > and indexing: per-object-field compression.
> > > > > > > > > > > > > > > > > > >>>> > > Per-field compression means that the
> > > > > > > > > > > > > > > > > > >>>> > > metadata (the header) of an object won't be
> > > > > > > > > > > > > > > > > > >>>> > > compressed; only the serialized values of
> > > > > > > > > > > > > > > > > > >>>> > > the object's fields (in byte array form)
> > > > > > > > > > > > > > > > > > >>>> > > will be compressed.
> > > > > > > > > > > > > > > > > > >>>> > >
> > > > > > > > > > > > > > > > > > >>>> > > This solution has some contentious issues:
> > > > > > > > > > > > > > > > > > >>>> > > - small values, like primitives and short
> > > > > > > > > > > > > > > > > > >>>> > > arrays - it doesn't make sense to compress
> > > > > > > > > > > > > > > > > > >>>> > > them;
> > > > > > > > > > > > > > > > > > >>>> > > - it is not possible to use compression
> > > > > > > > > > > > > > > > > > >>>> > > with Java predefined types;
> > > > > > > > > > > > > > > > > > >>>> > >
> > > > > > > > > > > > > > > > > > >>>> > > We can provide an annotation,
> > > > > > > > > > > > > > > > > > >>>> > > @IgniteCompression for example, which users
> > > > > > > > > > > > > > > > > > >>>> > > can use to mark the fields to compress.
> > > > > > > > > > > > > > > > > > >>>> > >
> > > > > > > > > > > > > > > > > > >>>> > > Any thoughts?
> > > > > > > > > > > > > > > > > > >>>> > >
> > > > > > > > > > > > > > > > > > >>>> > > Maybe someone already has a ready design?
> > > > > > > > > > > > > > > > > > >>>> > >
> > > > > > > > > > > > > > > > > > >>>> > > 2017-04-10 11:06 GMT+03:00
> > > Vyacheslav
> > > > > > > Daradur
> > > > > > > > <
> > > > > > > > > > > > > > > > > > [hidden email]
> > > > > > > > > > > > > > > > > > >>>> >:
> > > > > > > > > > > > > > > > > > >>>> > >
> > > > > > > > > > > > > > > > > > > >>>> > >> Alexey,
> > > > > > > > > > > > > > > > > > > >>>> > >>
> > > > > > > > > > > > > > > > > > > >>>> > >> Yes, I've read it.
> > > > > > > > > > > > > > > > > > > >>>> > >>
> > > > > > > > > > > > > > > > > > > >>>> > >> Ok, let's discuss about public API design.
> > > > > > > > > > > > > > > > > > > >>>> > >>
> > > > > > > > > > > > > > > > > > > >>>> > >> I think we need to add some a configure entity to CacheConfiguration,
> > > > > > > > > > > > > > > > > > > >>>> > >> which will contain the Compressor interface implementation and some
> > > > > > > > > > > > > > > > > > > >>>> > >> usefull parameters.
> > > > > > > > > > > > > > > > > > > >>>> > >> Or maybe to provide a BinaryMarshaller decorator, which will be compress
> > > > > > > > > > > > > > > > > > > >>>> > >> data after marshalling.
> > > > > > > > > > > > > > > > > > > >>>> > >>
> > > > > > > > > > > > > > > > > > > >>>> > >>
> > > > > > > > > > > > > > > > > > > >>>> > >> 2017-04-10 10:40 GMT+03:00 Alexey Kuznetsov <[hidden email]>:
> > > > > > > > > > > > > > > > > > >>>> > >>
> > > > > > > > > > > > > > > > > > > >>>> > >>> Vyacheslav,
> > > > > > > > > > > > > > > > > > > >>>> > >>>
> > > > > > > > > > > > > > > > > > > >>>> > >>> Did you read initial discussion [1] about compression?
> > > > > > > > > > > > > > > > > > > >>>> > >>> As far as I remember we agreed to add only some "top-level" API in order
> > > > > > > > > > > > > > > > > > > >>>> > >>> to provide a way for Ignite users to inject some sort of custom
> > > > > > > > > > > > > > > > > > > >>>> > >>> compression.
> > > > > > > > > > > > > > > > > > > >>>> > >>>
> > > > > > > > > > > > > > > > > > > >>>> > >>>
> > > > > > > > > > > > > > > > > > > >>>> > >>> [1]
> > > > > > > > > > > > > > > > > > > >>>> > >>> http://apache-ignite-developers.2346864.n4.nabble.com/Data-compression-in-Ignite-2-0-td10099.html
> > > > > > > > > > > > > > > > > > > >>>> > >>>
> > > > > > > > > > > > > > > > > > > >>>> > >>> On Mon, Apr 10, 2017 at 2:19 PM, daradurvs <[hidden email]> wrote:
> > > > > > > > > > > > > > > > > > >>>> > >>>
> > > > > > > > > > > > > > > > > > > >>>> > >>> > Hi Igniters!
> > > > > > > > > > > > > > > > > > > >>>> > >>> >
> > > > > > > > > > > > > > > > > > > >>>> > >>> > I am interested in this task.
> > > > > > > > > > > > > > > > > > > >>>> > >>> > Provide some kind of pluggable compression SPI support
> > > > > > > > > > > > > > > > > > > >>>> > >>> > <https://issues.apache.org/jira/browse/IGNITE-3592>
> > > > > > > > > > > > > > > > > > > >>>> > >>> >
> > > > > > > > > > > > > > > > > > > >>>> > >>> > I developed a solution on BinaryMarshaller-level, but reviewer has
> > > > > > > > > > > > > > > > > > > >>>> > >>> > rejected it.
> > > > > > > > > > > > > > > > > > > >>>> > >>> >
> > > > > > > > > > > > > > > > > > > >>>> > >>> > Let's continue discussion of task goals and solution design.
> > > > > > > > > > > > > > > > > > > >>>> > >>> > As I understood that, the main goal of this task is to store data in
> > > > > > > > > > > > > > > > > > > >>>> > >>> > compressed form.
> > > > > > > > > > > > > > > > > > > >>>> > >>> > This is what I need from Ignite as its user. Compression provides
> > > > > > > > > > > > > > > > > > > >>>> > >>> > economy on servers.
> > > > > > > > > > > > > > > > > > > >>>> > >>> > We can store more data on same servers at the cost of increasing CPU
> > > > > > > > > > > > > > > > > > > >>>> > >>> > utilization.
> > > > > > > > > > > > > > > > > > > >>>> > >>> >
> > > > > > > > > > > > > > > > > > > >>>> > >>> > I'm researching a possibility of implementation of compression at the
> > > > > > > > > > > > > > > > > > > >>>> > >>> > cache-level.
> > > > > > > > > > > > > > > > > > > >>>> > >>> >
> > > > > > > > > > > > > > > > > > > >>>> > >>> > Any thoughts?
> > > > > > > > > > > > > > > > > > >>>> > >>> >
> > > > > > > > > > > > > > > > > > >>>> > >>> > --
> > > > > > > > > > > > > > > > > > >>>> > >>> > Best regards,
> > > > > > > > > > > > > > > > > > >>>> > >>> > Vyacheslav
> > > > > > > > > > > > > > > > > > >>>> > >>> >
> > > > > > > > > > > > > > > > > > >>>> > >>> >
> > > > > > > > > > > > > > > > > > >>>> > >>> >
> > > > > > > > > > > > > > > > > > >>>> > >>> >
> > > > > > > > > > > > > > > > > > >>>> > >>> >
> > > > > > > > > > > > > > > > > > >>>> > >>>
> > > > > > > > > > > > > > > > > > >>>> > >>>
> > > > > > > > > > > > > > > > > > >>>> > >>>
> > > > > > > > > > > > > > > > > > >>>> > >>> --
> > > > > > > > > > > > > > > > > > >>>> > >>> Alexey Kuznetsov
> > > > > > > > > > > > > > > > > > >>>> > >>>
> > > > > > > > > > > > > > > > > > >>>> > >>
> > > > > > > > > > > > > > > > > > >>>> > >>
> > > > > > > > > > > > > > > > > > >>>> > >>
> > > > > > > > > > > > > > > > > > >>>> > >> --
> > > > > > > > > > > > > > > > > > >>>> > >> Best Regards, Vyacheslav
> > > > > > > > > > > > > > > > > > >>>> > >>
> > > > > > > > > > > > > > > > > > >>>> > >
> > > > > > > > > > > > > > > > > > >>>> > >
> > > > > > > > > > > > > > > > > > >>>> > >
> > > > > > > > > > > > > > > > > > >>>> > > --
> > > > > > > > > > > > > > > > > > >>>> > > Best Regards, Vyacheslav
> > > > > > > > > > > > > > > > > > >>>> > >
> > > > > > > > > > > > > > > --
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Best Regards, Anton Churaev
> > > > > > > > > > > > > > >

Re: Data compression in Ignite 2.0

daradurvs
>> which is much less useful.
Note that in some cases it reduces the size of an object by more than half.

>> Would it be possible to change your implementation to handle the
>> encryption instead?
Yes, of course. There is not much difference between compression and
encryption, including in my implementation of per-field compression.
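
To illustrate why the two are close, here is a minimal sketch (not the code
from the PR). It assumes that a per-field feature boils down to a
byte[] -> byte[] transform applied to the serialized value of a field. The
names FieldTransformer, DeflateTransformer and AesGcmTransformer are
hypothetical and are not part of Ignite; only JDK classes are used
(Deflater/Inflater for compression, AES-GCM from javax.crypto for encryption).

```java
import java.io.ByteArrayOutputStream;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.util.Arrays;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;

/** Hypothetical per-field transform over a serialized field value (not an Ignite API). */
interface FieldTransformer {
    byte[] encode(byte[] plain);
    byte[] decode(byte[] stored);
}

/** Compression variant, backed by the JDK's DEFLATE implementation. */
final class DeflateTransformer implements FieldTransformer {
    @Override public byte[] encode(byte[] plain) {
        Deflater deflater = new Deflater();
        deflater.setInput(plain);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[1024];
        while (!deflater.finished())
            out.write(buf, 0, deflater.deflate(buf));
        deflater.end();
        return out.toByteArray();
    }

    @Override public byte[] decode(byte[] stored) {
        Inflater inflater = new Inflater();
        inflater.setInput(stored);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[1024];
        try {
            while (!inflater.finished()) {
                int n = inflater.inflate(buf);
                // No progress and not finished: the input is truncated or needs a dictionary.
                if (n == 0 && !inflater.finished() && (inflater.needsInput() || inflater.needsDictionary()))
                    throw new IllegalStateException("Truncated or unsupported input");
                out.write(buf, 0, n);
            }
        } catch (DataFormatException e) { throw new IllegalStateException(e); }
        finally { inflater.end(); }
        return out.toByteArray();
    }
}

/** Encryption variant: AES-GCM with a fresh random IV stored in front of the ciphertext. */
final class AesGcmTransformer implements FieldTransformer {
    private static final int IV_LEN = 12;    // IV size commonly used with GCM
    private static final int TAG_BITS = 128; // Authentication tag size
    private final SecretKey key;

    AesGcmTransformer(SecretKey key) { this.key = key; }

    /** Convenience factory for the demo: generates a throwaway 128-bit key. */
    static AesGcmTransformer withRandomKey() {
        try {
            KeyGenerator gen = KeyGenerator.getInstance("AES");
            gen.init(128);
            return new AesGcmTransformer(gen.generateKey());
        } catch (GeneralSecurityException e) { throw new IllegalStateException(e); }
    }

    @Override public byte[] encode(byte[] plain) {
        try {
            byte[] iv = new byte[IV_LEN];
            new SecureRandom().nextBytes(iv);
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(TAG_BITS, iv));
            byte[] ct = cipher.doFinal(plain);
            byte[] out = Arrays.copyOf(iv, IV_LEN + ct.length); // IV first, then ciphertext + tag
            System.arraycopy(ct, 0, out, IV_LEN, ct.length);
            return out;
        } catch (GeneralSecurityException e) { throw new IllegalStateException(e); }
    }

    @Override public byte[] decode(byte[] stored) {
        try {
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.DECRYPT_MODE, key, new GCMParameterSpec(TAG_BITS, stored, 0, IV_LEN));
            return cipher.doFinal(stored, IV_LEN, stored.length - IV_LEN);
        } catch (GeneralSecurityException e) { throw new IllegalStateException(e); }
    }
}

final class FieldTransformDemo {
    public static void main(String[] args) {
        byte[] field = new byte[4096]; // Highly compressible sample value
        for (FieldTransformer t : new FieldTransformer[] {new DeflateTransformer(), AesGcmTransformer.withRandomKey()})
            System.out.println(t.getClass().getSimpleName() + " round trip ok: "
                + Arrays.equals(field, t.decode(t.encode(field))));
    }
}
```

One practical difference remains: encrypted bytes look random and do not
compress, so if both are ever wanted for the same field, compression would
have to run before encryption.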

2017-06-09 8:55 GMT+03:00 Dmitriy Setrakyan <[hidden email]>:

> Vyacheslav,
>
> When this feature started out as data compression in Ignite, it sounded
> very useful. Now it is unfolding as a per-field compression, which is much
> less useful. In fact, it is questionable whether it is useful at all. The
> fact that this feature is implemented does not make it mandatory for the
> community to accept it.
>
> However, as I mentioned before, per-field encryption is very useful, as it
> would allow users to automatically encrypt certain sensitive fields, like
> passwords, credit card numbers, etc. There is not much conceptual
> difference between compressing a field vs encrypting a field. Would it be
> possible to change your implementation to handle the encryption instead?
>
> D.
>
> On Thu, Jun 8, 2017 at 10:42 PM, Vyacheslav Daradur <[hidden email]>
> wrote:
>
> > Guys, I want to be clear:
> > * The "per-field compression" design is the result of researching Ignite's
> > binary infrastructure and some of its other parts (querying, indexing,
> > etc.).
> > * Full compression of an object would be more effective, but in that case
> > it is not compatible with querying and indexing (or there is a large
> > overhead from decompressing the full object (or cache pages) on demand).
> > * "Per-field compression" is one of the ways to implement the compression
> > feature.
> >
> > I'm new to Ignite, so I may be mistaken about some things.
> > For the last 3-4 months I've tried to start a discussion about the design,
> > but nobody has answered (except Dmitry and Valentin, who were interested in
> > how it works).
> > But I understand that this is a community and nobody is obliged to anybody.
> >
> > There are strong Ignite experts here.
> > If they can help me and the community with the design of the compression
> > feature, it will be great.
> > At the moment I have the desire and the time to work on the compression
> > feature in Ignite.
> > Let's use this opportunity :)
> >
> > 2017-06-09 5:36 GMT+03:00 Dmitriy Setrakyan <[hidden email]>:
> >
> > > Igniters,
> > >
> > > I have never seen a single Ignite user asking about compressing a single
> > > field. However, we have had requests to secure certain fields, e.g.
> > > passwords.
> > >
> > > I personally do not think per-field compression is needed, unless we can
> > > point out some concrete real life use cases.
> > >
> > > D.
> > >
> > > On Thu, Jun 8, 2017 at 3:42 AM, Vyacheslav Daradur <[hidden email]>
> > > wrote:
> > >
> > > > Anton,
> > > >
> > > > >> I thought that if compressed data is stored in memory, it will
> > > > >> also be transmitted over the wire in compressed form. Is that right?
> > > >
> > > > In the per-field compression case, yes.
> > > >
> > > > 2017-06-08 13:36 GMT+03:00 Антон Чураев <[hidden email]>:
> > > >
> > > > > Guys, could you please help me.
> > > > > I thought that if compressed data is stored in memory, it will
> > > > > also be transmitted over the wire in compressed form. Is that right?
> > > > >
> > > > > 2017-06-08 13:30 GMT+03:00 Vyacheslav Daradur <[hidden email]>:
> > > > >
> > > > > > Vladimir,
> > > > > >
> > > > > > The main problem I'm trying to solve is storing data in memory in
> > > > > > compressed form via Ignite.
> > > > > > The main goal is to use memory more effectively.
> > > > > >
> > > > > > >> here the much simpler step would be full
> > > > > > >> compression on a per-cache basis rather than dealing with the
> > > > > > >> per-field case.
> > > > > >
> > > > > > Please explain your idea. Compress data by memory page?
> > > > > > Is it compatible with querying and indexing?
> > > > > >
> > > > > > >> In the end, if a user would like to compress a particular field,
> > > > > > >> he can always do it on his own
> > > > > > I don't think we should reason this way: if a user needs something,
> > > > > > he tries to choose a tool which has this feature OOTB.
> > > > > >
> > > > > >
> > > > > >
> > > > > > 2017-06-08 12:53 GMT+03:00 Vladimir Ozerov <[hidden email]>:
> > > > > >
> > > > > > > Igniters,
> > > > > > >
> > > > > > > Honestly, I still do not see how to apply this feature gracefully
> > > > > > > to Ignite. And the overall approach of compressing only particular
> > > > > > > fields looks overcomplicated to me. Remember that our main use case
> > > > > > > is an application without classes on the server. It means that any
> > > > > > > kind of annotations are inapplicable. To be more precise: a proper
> > > > > > > API should be implemented to handle the no-class case (e.g. how
> > > > > > > would one build such an object through BinaryBuilder without a
> > > > > > > class?), and only then should annotations be added as a convenient
> > > > > > > addition to the more basic API.
> > > > > > >
> > > > > > > It seems to me that a full implementation, which takes into account
> > > > > > > a proper "classless" API, changes to binary metadata to reflect
> > > > > > > compressed fields, changes to SQL, changes to the binary protocol,
> > > > > > > and porting to .NET and CPP, will yield a very complex solution
> > > > > > > with little value to the product.
> > > > > > >
> > > > > > > Instead, as I proposed earlier, it seems that we'd better start
> > > > > > > with the problem we are trying to solve. Basically, compression
> > > > > > > could help in two cases:
> > > > > > > 1) Transmitting data over the wire - it should be implemented on
> > > > > > > the communication layer and should not affect the binary
> > > > > > > serialization component a lot.
> > > > > > > 2) Storing data in memory - here the much simpler step would be
> > > > > > > full compression on a per-cache basis rather than dealing with the
> > > > > > > per-field case.
> > > > > > >
> > > > > > > In the end, if a user would like to compress a particular field, he
> > > > > > > can always do it on his own, and set the already compressed field
> > > > > > > to our BinaryObject.
> > > > > > >
> > > > > > > Vladimir.
> > > > > > >
> > > > > > >
> > > > > > > On Thu, Jun 8, 2017 at 12:37 PM, Vyacheslav Daradur <[hidden email]>
> > > > > > > wrote:
> > > > > > >
> > > > > > > > Valentin,
> > > > > > > >
> > > > > > > > Yes, I have a prototype [1][2].
> > > > > > > >
> > > > > > > > You can see an example of a Java class [3] that I used in my
> > > > > > > > benchmark.
> > > > > > > > For example:
> > > > > > > > class Foo {
> > > > > > > >     @BinaryCompression
> > > > > > > >     String data;
> > > > > > > > }
> > > > > > > > If a user decides to store the object in compressed form, he can
> > > > > > > > use the annotation @BinaryCompression as shown above.
> > > > > > > > It means the annotated field 'data' will be compressed at
> > > > > > > > marshalling.
> > > > > > > >
> > > > > > > > [1] https://github.com/apache/ignite/pull/1951
> > > > > > > > [2] https://issues.apache.org/jira/browse/IGNITE-5226
> > > > > > > > [3]
> > > > > > > > https://github.com/daradurvs/ignite-compression/blob/master/src/main/java/ru/daradurvs/ignite/compression/model/Audit1F.java
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > 2017-06-08 2:04 GMT+03:00 Valentin Kulichenko <[hidden email]>:
> > > > > > > >
> > > > > > > > > Vyacheslav, Anton,
> > > > > > > > >
> > > > > > > > > Are there any ideas and/or prototypes for the API? Your design
> > > > > > > > > suggestions seem to make sense, but I would like to see how all
> > > > > > > > > this will look from the user's standpoint.
> > > > > > > > >
> > > > > > > > > -Val
> > > > > > > > >
> > > > > > > > > On Wed, Jun 7, 2017 at 1:06 AM, Антон Чураев <[hidden email]>
> > > > > > > > > wrote:
> > > > > > > > >
> > > > > > > > > > Vyacheslav, correct me if something is wrong.
> > > > > > > > > >
> > > > > > > > > > We could give users the opportunity to choose between CPU
> > > > > > > > > > usage and MEM/NET usage by compressing some attributes of
> > > > > > > > > > stored objects.
> > > > > > > > > > You have studied the design, and it is possible to localize
> > > > > > > > > > the changes in marshalling without affecting performance or
> > > > > > > > > > current functionality.
> > > > > > > > > >
> > > > > > > > > > I think that it's useful for our project and users.
> > > > > > > > > > Community, what do you think about this proposal?
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > 2017-06-06 17:29 GMT+03:00 Vyacheslav Daradur <[hidden email]>:
> > > > > > > > > >
> > > > > > > > > > > In short,
> > > > > > > > > > >
> > > > > > > > > > > During marshalling, a field is represented as a
> > > > > > > > > > > BinaryFieldAccessor, which manages its marshalling. It
> > > > > > > > > > > checks whether the field is marked with the annotation
> > > > > > > > > > > @BinaryCompression; in that case, the binary representation
> > > > > > > > > > > of the field (a byte array) will be compressed. It will be
> > > > > > > > > > > marked as compressed by a type constant
> > > > > > > > > > > (GridBinaryMarshaller.COMPRESSED), and after this the
> > > > > > > > > > > compressed byte array will be included in the binary
> > > > > > > > > > > representation of the whole object. Note, the header of the
> > > > > > > > > > > marshalled object will not be compressed. Compression
> > > > > > > > > > > affects only the object's field representation.
> > > > > > > > > > >
> > > > > > > > > > > Objects in IgniteCache are represented as BinaryObject,
> > > > > > > > > > > which is a wrapper over the byte array of the marshalled
> > > > > > > > > > > object.
> > > > > > > > > > > BinaryObject provides some useful methods, which are used
> > > > > > > > > > > by Ignite systems.
> > > > > > > > > > > For example, the queries use the BinaryObject#field method,
> > > > > > > > > > > which deserializes only a field of an object, without
> > > > > > > > > > > deserializing the whole object.
> > > > > > > > > > > During deserialization, if the BinaryObject#field method
> > > > > > > > > > > meets the constant of the compressed type, it decompresses
> > > > > > > > > > > this byte array, then continues unmarshalling as usual.
> > > > > > > > > > >
> > > > > > > > > > > Now, I have introduced the Compressor interface in
> > > > > > > > > > > IgniteConfiguration; it allows a user to use his own
> > > > > > > > > > > implementation of the compressor - it is the requirement in
> > > > > > > > > > > the task [1].
> > > > > > > > > > >
> > > > > > > > > > > As far as I know, Vladimir Ozerov doesn't like the idea of
> > > > > > > > > > > granting this opportunity to the user.
> > > > > > > > > > > In that case we can choose a compression algorithm which we
> > > > > > > > > > > will provide by default and move the interface to the
> > > > > > > > > > > internals of the binary infrastructure.
> > > > > > > > > > > For this case I've prepared the benchmarks, which I sent
> > > > > > > > > > > earlier.
> > > > > > > > > > >
> > > > > > > > > > > I vote for the ZSTD algorithm [2]; it provides a good
> > > > > > > > > > > compression ratio and good throughput. It has
> > > > > > > > > > > implementations in Java, .NET and C++, and has an
> > > > > > > > > > > ASF-friendly license, so we can use it on all Ignite
> > > > > > > > > > > platforms.
> > > > > > > > > > > You can look at an assessment of this algorithm in my
> > > > > > > > > > > benchmarks.
> > > > > > > > > > >
> > > > > > > > > > > [1] https://issues.apache.org/jira/browse/IGNITE-3592
> > > > > > > > > > > [2] https://github.com/facebook/zstd
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > 2017-06-06 16:02 GMT+03:00 Антон Чураев <[hidden email]>:
> > > > > > > > > > >
> > > > > > > > > > > > Looks good to me.
> > > > > > > > > > > >
> > > > > > > > > > > > Could you propose a design of the implementation in a
> > > > > > > > > > > > couple of sentences?
> > > > > > > > > > > > So that we can estimate the completeness and complexity
> > > > > > > > > > > > of the proposal.
> > > > > > > > > > > >
> > > > > > > > > > > > 2017-06-06 15:26 GMT+03:00 Vyacheslav Daradur <[hidden email]>:
> > > > > > > > > > > >
> > > > > > > > > > > > > Anton,
> > > > > > > > > > > > >
> > > > > > > > > > > > > Of course, the solution does not affect the existing
> > > > > > > > > > > > > implementation. I mean, there are no changes if the
> > > > > > > > > > > > > user does not use the annotation @BinaryCompression
> > > > > > > > > > > > > (no performance changes).
> > > > > > > > > > > > > Only if the user decides to use compression on a
> > > > > > > > > > > > > specific field or fields of a class will compression
> > > > > > > > > > > > > be used at marshalling for the annotated fields.
> > > > > > > > > > > > >
> > > > > > > > > > > > > 2017-06-06 15:10 GMT+03:00 Антон Чураев <[hidden email]>:
> > > > > > > > > > > > >
> > > > > > > > > > > > > > Vyacheslav,
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Is it possible to propose an implementation that can
> > > > > > > > > > > > > > be switched on on demand?
> > > > > > > > > > > > > > In this case it should not affect the performance of
> > > > > > > > > > > > > > the current solution.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > I mean that users should decide what is more
> > > > > > > > > > > > > > important for them: throughput or memory/net usage.
> > > > > > > > > > > > > > Maybe they will choose not all objects, or only some
> > > > > > > > > > > > > > attributes of objects, to compress.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > 2017-06-06 14:48 GMT+03:00 Vyacheslav Daradur <[hidden email]>:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Conclusion:
> > > > > > > > > > > > > > > The provided solution allows reducing the size of
> > > > > > > > > > > > > > > an object in IgniteCache at the cost of a
> > > > > > > > > > > > > > > throughput reduction (small, in some cases); it
> > > > > > > > > > > > > > > depends on the part of the object which will be
> > > > > > > > > > > > > > > compressed and on the compression algorithm.
> > > > > > > > > > > > > > > I mean, we can make more effective use of memory,
> > > > > > > > > > > > > > > and in some cases it can reduce the load on the
> > > > > > > > > > > > > > > interconnect (replication, rebalancing).
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > In particular, it will be useful for object fields
> > > > > > > > > > > > > > > which are large text (>~ 250 bytes) and can be
> > > > > > > > > > > > > > > compressed effectively.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > 2017-06-06 12:00 GMT+03:00 Антон Чураев <[hidden email]>:
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Vyacheslav, thank you! But could you please
> > > > > > > > > > > > > > > > provide conclusions or proposals based on these
> > > > > > > > > > > > > > > > benchmarks?
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > 2017-06-06 11:28 GMT+03:00 Vyacheslav Daradur <[hidden email]>:
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Dmitry,
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Excel pages:
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > 1) "Compression ratio (2)" - shows object size
> > > > > > > > > > > > > > > > > with and without compression. (Conditions:
> > > > > > > > > > > > > > > > > literal text)
> > > > > > > > > > > > > > > > > The 1st graph shows the compression ratios of
> > > > > > > > > > > > > > > > > different compression algorithms depending on
> > > > > > > > > > > > > > > > > the size of the compressed field.
> > > > > > > > > > > > > > > > > The 2nd graph shows the size of objects
> > > > > > > > > > > > > > > > > depending on sizes and compression algorithms.
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > 2) "Compression ratio (1)" - shows object size
> > > > > > > > > > > > > > > > > with and without compression. (Conditions:
> > > > > > > > > > > > > > > > > badly compressible character sequence)
> > > > > > > > > > > > > > > > > The 1st graph shows the compression ratios of
> > > > > > > > > > > > > > > > > different compression algorithms depending on
> > > > > > > > > > > > > > > > > the size of the compressed field.
> > > > > > > > > > > > > > > > > The 2nd graph shows the size of objects
> > > > > > > > > > > > > > > > > depending on sizes and compression algorithms.
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > 3) "put-avg" - shows the average time of the
> > > > > > > > > > > > > > > > > "put" operation depending on size and
> > > > > > > > > > > > > > > > > compression algorithms.
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > 4) "put-thrpt" - shows the throughput of the
> > > > > > > > > > > > > > > > > "put" operation depending on size and
> > > > > > > > > > > > > > > > > compression algorithms.
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > 5) "get-avg" - shows the average time of the
> > > > > > > > > > > > > > > > > "get" operation depending on size and
> > > > > > > > > > > > > > > > > compression algorithms.
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > 6) "get-thrpt" - shows the throughput of the
> > > > > > > > > > > > > > > > > "get" operation depending on size and
> > > > > > > > > > > > > > > > > compression algorithms.
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > 2017-06-06 10:59 GMT+03:00 Dmitriy Setrakyan <[hidden email]>:
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > Vladimir, I am not sure how to interpret the
> > > > > > > > > > > > > > > > > > graphs? What are we looking at?
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > On Tue, Jun 6, 2017 at 12:33 AM, Vyacheslav Daradur <[hidden email]>
> > > > > > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > Hi, Igniters.
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > I've prepared some benchmarking.
> Results
> > > [1].
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > And I've prepared the evaluation in the
> > > form
> > > > of
> > > > > > > > > diagrams
> > > > > > > > > > > [2].
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > I hope that helps to interest the
> > community
> > > > and
> > > > > > > > > > > accelerates a
> > > > > > > > > > > > > > > > reaction
> > > > > > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > > > > this improvment :)
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > [1]
> > > > > > > > > > > > > > > > > > > https://github.com/daradurvs/
> > > > > > > > ignite-compression/tree/
> > > > > > > > > > > > > > > > > > > master/src/main/resources/result
> > > > > > > > > > > > > > > > > > > [2] https://drive.google.com/file/d/
> > > > > > > > > > > > > > 0B2CeUAOgrHkoMklyZ25YTEdKcEk/
> > > > > > > > > > > > > > > > view
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > 2017-05-24 9:49 GMT+03:00 Vyacheslav
> > > Daradur
> > > > <
> > > > > > > > > > > > > > [hidden email]
> > > > > > > > > > > > > > > >:
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > Guys, any thoughts?
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > 2017-05-16 13:40 GMT+03:00 Vyacheslav
> > > > > Daradur <
> > > > > > > > > > > > > > > [hidden email]
> > > > > > > > > > > > > > > > >:
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > >> Hi guys,
> > > > > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > > > > >> I've prepared the PR to show my
> idea.
> > > > > > > > > > > > > > > > > > > >> https://github.com/apache/
> > > > > > > ignite/pull/1951/files
> > > > > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > > > > >> About querying - I've just copied
> > > existing
> > > > > > tests
> > > > > > > > and
> > > > > > > > > > > have
> > > > > > > > > > > > > > > > annotated
> > > > > > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > > >> testing data.
> > > > > > > > > > > > > > > > > > > >> https://github.com/apache/
> > > > > > > > > > ignite/pull/1951/files#diff-
> > > > > > > > > > > > > c19a9d
> > > > > > > > > > > > > > > > > > > >> f4058141d059bb577e75244764
> > > > > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > > > > >> It means fields which will be marked
> > by
> > > > > > > > > > > @BinaryCompression
> > > > > > > > > > > > > > will
> > > > > > > > > > > > > > > be
> > > > > > > > > > > > > > > > > > > >> compressed at marshalling via
> > > > > > BinaryMarshaller.
> > > > > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > > > > >> This solution has no effect on
> > existing
> > > > data
> > > > > > or
> > > > > > > > > > project
> > > > > > > > > > > > > > > > > architecture.
> > > > > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > > > > >> I'll be glad to see your thougths.
> > > > > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > > > > >> 2017-05-15 19:18 GMT+03:00
> Vyacheslav
> > > > > Daradur
> > > > > > <
> > > > > > > > > > > > > > > > [hidden email]
> > > > > > > > > > > > > > > > > >:
> > > > > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > > > > >>> Dmitriy,
> > > > > > > > > > > > > > > > > > > >>>
> > > > > > > > > > > > > > > > > > > >>> I have ready prototype. I want to
> > show
> > > > it.
> > > > > > > > > > > > > > > > > > > >>> It is always easier to discuss on
> > > > example.
> > > > > > > > > > > > > > > > > > > >>>
> > > > > > > > > > > > > > > > > > > >>> 2017-05-15 19:02 GMT+03:00 Dmitriy
> > > > > Setrakyan
> > > > > > <
> > > > > > > > > > > > > > > > > [hidden email]
> > > > > > > > > > > > > > > > > > >:
> > > > > > > > > > > > > > > > > > > >>>
> > > > > > > > > > > > > > > > > > > >>>> Vyacheslav,
> > > > > > > > > > > > > > > > > > > >>>>
> > > > > > > > > > > > > > > > > > > >>>> I think it is a bit premature to
> > > > provide a
> > > > > > PR
> > > > > > > > > > without
> > > > > > > > > > > > > > getting
> > > > > > > > > > > > > > > a
> > > > > > > > > > > > > > > > > > > >>>> community
> > > > > > > > > > > > > > > > > > > >>>> consensus on the dev list. Please
> > > allow
> > > > > some
> > > > > > > > time
> > > > > > > > > > for
> > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > community
> > > > > > > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > > > > >>>> respond.
> > > > > > > > > > > > > > > > > > > >>>>
> > > > > > > > > > > > > > > > > > > >>>> D.
> > > > > > > > > > > > > > > > > > > >>>>
> > > > > > > > > > > > > > > > > > > >>>> On Mon, May 15, 2017 at 6:36 AM,
> > > > > Vyacheslav
> > > > > > > > > Daradur
> > > > > > > > > > <
> > > > > > > > > > > > > > > > > > > >>>> [hidden email]>
> > > > > > > > > > > > > > > > > > > >>>> wrote:
> > > > > > > > > > > > > > > > > > > >>>>
> > > > > > > > > > > > > > > > > > > >>>> > I created the ticket:
> > > > > > > > > > > https://issues.apache.org/jira
> > > > > > > > > > > > > > > > > > > >>>> /browse/IGNITE-5226
> > > > > > > > > > > > > > > > > > > >>>> >
> > > > > > > > > > > > > > > > > > > >>>> > I'll prepare a PR with described
> > > > > solution
> > > > > > in
> > > > > > > > > > couple
> > > > > > > > > > > of
> > > > > > > > > > > > > > days.
> > > > > > > > > > > > > > > > > > > >>>> >
> > > > > > > > > > > > > > > > > > > >>>> > 2017-05-15 15:05 GMT+03:00
> > > Vyacheslav
> > > > > > > Daradur
> > > > > > > > <
> > > > > > > > > > > > > > > > > > [hidden email]
> > > > > > > > > > > > > > > > > > > >:
> > > > > > > > > > > > > > > > > > > >>>> >
> > > > > > > > > > > > > > > > > > > >>>> > > Hi, Igniters!
> > > > > > > > > > > > > > > > > > > >>>> > >
> > > > > > > > > > > > > > > > > > > >>>> > > Apache 2.0 is released.
> > > > > > > > > > > > > > > > > > > >>>> > >
> > > > > > > > > > > > > > > > > > > >>>> > > Let's continue the discussion
> > > about
> > > > a
> > > > > > > > > > compression
> > > > > > > > > > > > > > design.
> > > > > > > > > > > > > > > > > > > >>>> > >
> > > > > > > > > > > > > > > > > > > >>>> > > At the moment, I found only
> one
> > > > > solution
> > > > > > > > which
> > > > > > > > > > is
> > > > > > > > > > > > > > > compatible
> > > > > > > > > > > > > > > > > > with
> > > > > > > > > > > > > > > > > > > >>>> > querying
> > > > > > > > > > > > > > > > > > > >>>> > > and indexing, this is
> > > > > per-objects-field
> > > > > > > > > > > compression.
> > > > > > > > > > > > > > > > > > > >>>> > > Per-fields compression means
> > that
> > > > > > metadata
> > > > > > > > (a
> > > > > > > > > > > > header)
> > > > > > > > > > > > > of
> > > > > > > > > > > > > > > an
> > > > > > > > > > > > > > > > > > object
> > > > > > > > > > > > > > > > > > > >>>> won't
> > > > > > > > > > > > > > > > > > > >>>> > > be compressed, only serialized
> > > > values
> > > > > of
> > > > > > > an
> > > > > > > > > > object
> > > > > > > > > > > > > > fields
> > > > > > > > > > > > > > > > (in
> > > > > > > > > > > > > > > > > > > bytes
> > > > > > > > > > > > > > > > > > > >>>> array
> > > > > > > > > > > > > > > > > > > >>>> > > form) will be compressed.
> > > > > > > > > > > > > > > > > > > >>>> > >
> > > > > > > > > > > > > > > > > > > >>>> > > This solution have some
> > > contentious
> > > > > > > issues:
> > > > > > > > > > > > > > > > > > > >>>> > > - small values, like
> primitives
> > > and
> > > > > > short
> > > > > > > > > > arrays -
> > > > > > > > > > > > > there
> > > > > > > > > > > > > > > > isn't
> > > > > > > > > > > > > > > > > > > >>>> sense to
> > > > > > > > > > > > > > > > > > > >>>> > > compress them;
> > > > > > > > > > > > > > > > > > > >>>> > > - there is no possible to use
> > > > > > compression
> > > > > > > > with
> > > > > > > > > > > > > > > > java-predefined
> > > > > > > > > > > > > > > > > > > >>>> types;
> > > > > > > > > > > > > > > > > > > >>>> > >
> > > > > > > > > > > > > > > > > > > >>>> > > We can provide an annotation,
> > > > > > > > > > @IgniteCompression -
> > > > > > > > > > > > for
> > > > > > > > > > > > > > > > > example,
> > > > > > > > > > > > > > > > > > > >>>> which can
> > > > > > > > > > > > > > > > > > > >>>> > > be used by users for marking
> > > fields
> > > > to
> > > > > > > > > compress.
> > > > > > > > > > > > > > > > > > > >>>> > >
> > > > > > > > > > > > > > > > > > > >>>> > > Any thoughts?
> > > > > > > > > > > > > > > > > > > >>>> > >
> > > > > > > > > > > > > > > > > > > >>>> > > Maybe someone already have
> ready
> > > > > design?
> > > > > > > > > > > > > > > > > > > >>>> > >
> > > > > > > > > > > > > > > > > > > >>>> > > 2017-04-10 11:06 GMT+03:00
> > > > Vyacheslav
> > > > > > > > Daradur
> > > > > > > > > <
> > > > > > > > > > > > > > > > > > > [hidden email]
> > > > > > > > > > > > > > > > > > > >>>> >:
> > > > > > > > > > > > > > > > > > > >>>> > >
> > > > > > > > > > > > > > > > > > > >>>> > >> Alexey,
> > > > > > > > > > > > > > > > > > > >>>> > >>
> > > > > > > > > > > > > > > > > > > >>>> > >> Yes, I've read it.
> > > > > > > > > > > > > > > > > > > >>>> > >>
> > > > > > > > > > > > > > > > > > > >>>> > >> Ok, let's discuss about
> public
> > > API
> > > > > > > design.
> > > > > > > > > > > > > > > > > > > >>>> > >>
> > > > > > > > > > > > > > > > > > > >>>> > >> I think we need to add some a
> > > > > configure
> > > > > > > > > entity
> > > > > > > > > > to
> > > > > > > > > > > > > > > > > > > >>>> CacheConfiguration,
> > > > > > > > > > > > > > > > > > > >>>> > >> which will contain the
> > Compressor
> > > > > > > interface
> > > > > > > > > > > > > > > implementation
> > > > > > > > > > > > > > > > > and
> > > > > > > > > > > > > > > > > > > some
> > > > > > > > > > > > > > > > > > > >>>> > usefull
> > > > > > > > > > > > > > > > > > > >>>> > >> parameters.
> > > > > > > > > > > > > > > > > > > >>>> > >> Or maybe to provide a
> > > > > BinaryMarshaller
> > > > > > > > > > decorator,
> > > > > > > > > > > > > which
> > > > > > > > > > > > > > > > will
> > > > > > > > > > > > > > > > > be
> > > > > > > > > > > > > > > > > > > >>>> compress
> > > > > > > > > > > > > > > > > > > >>>> > >> data after marshalling.
> > > > > > > > > > > > > > > > > > > >>>> > >>
> > > > > > > > > > > > > > > > > > > >>>> > >>
> > > > > > > > > > > > > > > > > > > >>>> > >> 2017-04-10 10:40 GMT+03:00
> > Alexey
> > > > > > > > Kuznetsov <
> > > > > > > > > > > > > > > > > > > [hidden email]
> > > > > > > > > > > > > > > > > > > >>>> >:
> > > > > > > > > > > > > > > > > > > >>>> > >>
> > > > > > > > > > > > > > > > > > > >>>> > >>> Vyacheslav,
> > > > > > > > > > > > > > > > > > > >>>> > >>>
> > > > > > > > > > > > > > > > > > > >>>> > >>> Did you read initial
> > discussion
> > > > [1]
> > > > > > > about
> > > > > > > > > > > > > compression?
> > > > > > > > > > > > > > > > > > > >>>> > >>> As far as I remember we
> agreed
> > > to
> > > > > add
> > > > > > > only
> > > > > > > > > > some
> > > > > > > > > > > > > > > > "top-level"
> > > > > > > > > > > > > > > > > > API
> > > > > > > > > > > > > > > > > > > in
> > > > > > > > > > > > > > > > > > > >>>> > order
> > > > > > > > > > > > > > > > > > > >>>> > >>> to
> > > > > > > > > > > > > > > > > > > >>>> > >>> provide a way for
> > > > > > > > > > > > > > > > > > > >>>> > >>> Ignite users to inject some
> > sort
> > > > of
> > > > > > > custom
> > > > > > > > > > > > > > compression.
> > > > > > > > > > > > > > > > > > > >>>> > >>>
> > > > > > > > > > > > > > > > > > > >>>> > >>>
> > > > > > > > > > > > > > > > > > > >>>> > >>> [1]
> > > > > > > > > > > > > > > > > > > >>>> > >>>
> > http://apache-ignite-developer
> > > > > > > > > > > s.2346864.n4.nabble
> > > > > > > > > > > > .
> > > > > > > > > > > > > > > > > com/Data-c
> > > > > > > > > > > > > > > > > > > >>>> > >>> ompression-in-Ignite-2-0-
> > > > > td10099.html
> > > > > > > > > > > > > > > > > > > >>>> > >>>
> > > > > > > > > > > > > > > > > > > >>>> > >>> On Mon, Apr 10, 2017 at 2:19
> > PM,
> > > > > > > > daradurvs <
> > > > > > > > > > > > > > > > > > [hidden email]
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > >>>> > wrote:
> > > > > > > > > > > > > > > > > > > >>>> > >>>
> > > > > > > > > > > > > > > > > > > >>>> > >>> > Hi Igniters!
> > > > > > > > > > > > > > > > > > > >>>> > >>> >
> > > > > > > > > > > > > > > > > > > >>>> > >>> > I am interested in this
> > task.
> > > > > > > > > > > > > > > > > > > >>>> > >>> > Provide some kind of
> > pluggable
> > > > > > > > compression
> > > > > > > > > > SPI
> > > > > > > > > > > > > > support
> > > > > > > > > > > > > > > > > > > >>>> > >>> > <
> https://issues.apache.org/
> > > > > > > > > > > > > jira/browse/IGNITE-3592>
> > > > > > > > > > > > > > > > > > > >>>> > >>> >
> > > > > > > > > > > > > > > > > > > >>>> > >>> > I developed a solution on
> > > > > > > > > > > > BinaryMarshaller-level,
> > > > > > > > > > > > > > but
> > > > > > > > > > > > > > > > > > reviewer
> > > > > > > > > > > > > > > > > > > >>>> has
> > > > > > > > > > > > > > > > > > > >>>> > >>> rejected
> > > > > > > > > > > > > > > > > > > >>>> > >>> > it.
> > > > > > > > > > > > > > > > > > > >>>> > >>> >
> > > > > > > > > > > > > > > > > > > >>>> > >>> > Let's continue discussion
> of
> > > > task
> > > > > > > goals
> > > > > > > > > and
> > > > > > > > > > > > > solution
> > > > > > > > > > > > > > > > > design.
> > > > > > > > > > > > > > > > > > > >>>> > >>> > As I understood that, the
> > main
> > > > > goal
> > > > > > of
> > > > > > > > > this
> > > > > > > > > > > task
> > > > > > > > > > > > > is
> > > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > > store
> > > > > > > > > > > > > > > > > > > >>>> data in
> > > > > > > > > > > > > > > > > > > >>>> > >>> > compressed form.
> > > > > > > > > > > > > > > > > > > >>>> > >>> > This is what I need from
> > > Ignite
> > > > as
> > > > > > its
> > > > > > > > > user.
> > > > > > > > > > > > > > > Compression
> > > > > > > > > > > > > > > > > > > >>>> provides
> > > > > > > > > > > > > > > > > > > >>>> > >>> economy
> > > > > > > > > > > > > > > > > > > >>>> > >>> > on
> > > > > > > > > > > > > > > > > > > >>>> > >>> > servers.
> > > > > > > > > > > > > > > > > > > >>>> > >>> > We can store more data on
> > same
> > > > > > servers
> > > > > > > > at
> > > > > > > > > > the
> > > > > > > > > > > > cost
> > > > > > > > > > > > > > of
> > > > > > > > > > > > > > > > > > > >>>> increasing CPU
> > > > > > > > > > > > > > > > > > > >>>> > >>> > utilization.
> > > > > > > > > > > > > > > > > > > >>>> > >>> >
> > > > > > > > > > > > > > > > > > > >>>> > >>> > I'm researching a
> > possibility
> > > of
> > > > > > > > > > > implementation
> > > > > > > > > > > > of
> > > > > > > > > > > > > > > > > > compression
> > > > > > > > > > > > > > > > > > > >>>> at the
> > > > > > > > > > > > > > > > > > > >>>> > >>> > cache-level.
> > > > > > > > > > > > > > > > > > > >>>> > >>> >
> > > > > > > > > > > > > > > > > > > >>>> > >>> > Any thoughts?
> > > > > > > > > > > > > > > > > > > >>>> > >>> >
> > > > > > > > > > > > > > > > > > > >>>> > >>> > --
> > > > > > > > > > > > > > > > > > > >>>> > >>> > Best regards,
> > > > > > > > > > > > > > > > > > > >>>> > >>> > Vyacheslav
> > > > > > > > > > > > > > > > > > > >>>> > >>> >
> > > > > > > > > > > > > > > > > > > >>>> > >>> >
> > > > > > > > > > > > > > > > > > > >>>> > >>> >
> > > > > > > > > > > > > > > > > > > >>>> > >>> >
> > > > > > > > > > > > > > > > > > > >>>> > >>> > --
> > > > > > > > > > > > > > > > > > > >>>> > >>> > View this message in
> > context:
> > > > > > > > > > > > > http://apache-ignite-
> > > > > > > > > > > > > > > > > > > >>>> > >>> >
> > developers.2346864.n4.nabble.
> > > > > > > > > > > > > > com/Data-compression-in-
> > > > > > > > > > > > > > > > > > > >>>> > >>> >
> > Ignite-2-0-tp10099p16317.html
> > > > > > > > > > > > > > > > > > > >>>> > >>> > Sent from the Apache
> Ignite
> > > > > > Developers
> > > > > > > > > > mailing
> > > > > > > > > > > > > list
> > > > > > > > > > > > > > > > > archive
> > > > > > > > > > > > > > > > > > at
> > > > > > > > > > > > > > > > > > > >>>> > >>> Nabble.com.
> > > > > > > > > > > > > > > > > > > >>>> > >>> >
> > > > > > > > > > > > > > > > > > > >>>> > >>>
> > > > > > > > > > > > > > > > > > > >>>> > >>>
> > > > > > > > > > > > > > > > > > > >>>> > >>>
> > > > > > > > > > > > > > > > > > > >>>> > >>> --
> > > > > > > > > > > > > > > > > > > >>>> > >>> Alexey Kuznetsov
> > > > > > > > > > > > > > > > > > > >>>> > >>>
> > > > > > > > > > > > > > > > > > > >>>> > >>
> > > > > > > > > > > > > > > > > > > >>>> > >>
> > > > > > > > > > > > > > > > > > > >>>> > >>
> > > > > > > > > > > > > > > > > > > >>>> > >> --
> > > > > > > > > > > > > > > > > > > >>>> > >> Best Regards, Vyacheslav
> > > > > > > > > > > > > > > > > > > >>>> > >>
> > > > > > > > > > > > > > > > > > > >>>> > >
> > > > > > > > > > > > > > > > > > > >>>> > >
> > > > > > > > > > > > > > > > > > > >>>> > >
> > > > > > > > > > > > > > > > > > > >>>> > > --
> > > > > > > > > > > > > > > > > > > >>>> > > Best Regards, Vyacheslav
> > > > > > > > > > > > > > > > > > > >>>> > >
> > > > > > > > > > > > > > > > > > > >>>> >
> > > > > > > > > > > > > > > > > > > >>>> >
> > > > > > > > > > > > > > > > > > > >>>> >
> > > > > > > > > > > > > > > > > > > >>>> > --
> > > > > > > > > > > > > > > > > > > >>>> > Best Regards, Vyacheslav
> > > > > > > > > > > > > > > > > > > >>>> >
> > > > > > > > > > > > > > > > > > > >>>>
> > > > > > > > > > > > > > > > > > > >>>
> > > > > > > > > > > > > > > > > > > >>>
> > > > > > > > > > > > > > > > > > > >>>
> > > > > > > > > > > > > > > > > > > >>> --
> > > > > > > > > > > > > > > > > > > >>> Best Regards, Vyacheslav
> > > > > > > > > > > > > > > > > > > >>>
> > > > > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > > > > >> --
> > > > > > > > > > > > > > > > > > > >> Best Regards, Vyacheslav
> > > > > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > --
> > > > > > > > > > > > > > > > > > > > Best Regards, Vyacheslav
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > --
> > > > > > > > > > > > > > > > > > > Best Regards, Vyacheslav
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > --
> > > > > > > > > > > > > > > > > Best Regards, Vyacheslav
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > --
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Best Regards, Anton Churaev
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > --
> > > > > > > > > > > > > > > Best Regards, Vyacheslav
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > --
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Best Regards, Anton Churaev
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > --
> > > > > > > > > > > > > Best Regards, Vyacheslav
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > --
> > > > > > > > > > > >
> > > > > > > > > > > > Best Regards, Anton Churaev
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > --
> > > > > > > > > > > Best Regards, Vyacheslav
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > --
> > > > > > > > > >
> > > > > > > > > > Best Regards, Anton Churaev
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > --
> > > > > > > > Best Regards, Vyacheslav
> > > > > > > >
> > > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > > --
> > > > > > Best Regards, Vyacheslav
> > > > > >
> > > > >
> > > > >
> > > > >
> > > > > --
> > > > >
> > > > > Best Regards, Anton Churaev
> > > > >
> > > >
> > > >
> > > >
> > > > --
> > > > Best Regards, Vyacheslav
> > > >
> > >
> >
> >
> >
> > --
> > Best Regards, Vyacheslav
> >
>



--
Best Regards, Vyacheslav

Re: Data compression in Ignite 2.0

Vladimir Ozerov
Dima,

Encryption of certain fields is as bad as compression. First, it is a huge
change, which makes the already complex binary protocol even more complex.
Second, it has to be ported to the CPP and .NET platforms, as well as to
JDBC and ODBC.
Last, but most important: encrypting sensitive data is not our headache.
This is the user's responsibility. Nobody in their right mind will store
passwords in plain form. Instead, the user should encrypt them on his own,
choosing proper encryption parameters - algorithms, key lengths, salts,
etc. How are you going to expose this in the API or configuration?

We should not implement data encryption at the binary level; this is out of
the question. Encryption should be implemented at the application level
(user effort), the transport layer (SSL - we already have it), and possibly
at the disk level (there are tools for this already).
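
To illustrate the application-level option, here is a minimal sketch of a user encrypting a sensitive value in their own code before it is stored as an ordinary field. It assumes AES-GCM from the standard javax.crypto API; the FieldCrypto class and its method names are hypothetical and not part of Ignite, and key storage and rotation are left out entirely.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;

// Hypothetical helper (an assumption for illustration, not an Ignite API):
// encrypts a single field value in application code with AES-GCM.
final class FieldCrypto {
    private static final int IV_BYTES = 12;   // nonce size commonly used with GCM
    private static final int TAG_BITS = 128;  // authentication tag length in bits
    private static final SecureRandom RANDOM = new SecureRandom();

    private FieldCrypto() {}

    // Generates a fresh AES-128 key; a real application would load it from a key store.
    static SecretKey newKey() {
        try {
            KeyGenerator gen = KeyGenerator.getInstance("AES");
            gen.init(128);
            return gen.generateKey();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Returns the random IV followed by the ciphertext and its authentication tag.
    static byte[] encrypt(SecretKey key, String plain) {
        try {
            byte[] iv = new byte[IV_BYTES];
            RANDOM.nextBytes(iv);
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(TAG_BITS, iv));
            byte[] ct = cipher.doFinal(plain.getBytes(StandardCharsets.UTF_8));
            return ByteBuffer.allocate(iv.length + ct.length).put(iv).put(ct).array();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Reads the IV from the first bytes of the stored value and decrypts the rest.
    static String decrypt(SecretKey key, byte[] stored) {
        try {
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.DECRYPT_MODE, key,
                new GCMParameterSpec(TAG_BITS, stored, 0, IV_BYTES));
            byte[] plain = cipher.doFinal(stored, IV_BYTES, stored.length - IV_BYTES);
            return new String(plain, StandardCharsets.UTF_8);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        SecretKey key = newKey();
        // 'stored' is what the application would put into the cache as a byte[] field.
        byte[] stored = encrypt(key, "4111-1111-1111-1111");
        System.out.println(decrypt(key, stored));
    }
}
```

With this approach the cluster only ever sees an opaque byte[] and never the key. The trade-off is that such a field cannot be queried or indexed by its plain value.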


On Fri, Jun 9, 2017 at 9:06 AM, Vyacheslav Daradur <[hidden email]>
wrote:

> >> which is much less useful.
> I note that in some cases the saving is more than twofold in the size of
> an object.
>
> >> Would it be possible to change your implementation to handle the
> >> encryption instead?
> Yes, of course, there's not much difference between compression and
> encryption, including in my implementation of per-field compression.
>
> 2017-06-09 8:55 GMT+03:00 Dmitriy Setrakyan <[hidden email]>:
>
> > Vyacheslav,
> >
> > When this feature started out as data compression in Ignite, it sounded
> > very useful. Now it is unfolding as a per-field compression, which is
> > much less useful. In fact, it is questionable whether it is useful at
> > all. The fact that this feature is implemented does not make it
> > mandatory for the community to accept it.
> >
> > However, as I mentioned before, per-field encryption is very useful, as
> > it would allow users to automatically encrypt certain sensitive fields,
> > like passwords, credit card numbers, etc. There is not much conceptual
> > difference between compressing a field vs encrypting a field. Would it
> > be possible to change your implementation to handle the encryption
> > instead?
> >
> > D.
> >
> > On Thu, Jun 8, 2017 at 10:42 PM, Vyacheslav Daradur <[hidden email]>
> > wrote:
> >
> > > Guys, I want to be clear:
> > > * The "per-field compression" design is the result of researching the
> > > binary infrastructure of Ignite and some of its other parts (querying,
> > > indexing, etc.)
> > > * Full compression of an object would be more effective, but in that
> > > case it is not compatible with querying and indexing (or there is a
> > > large overhead from decompressing the full object (or cache pages) on
> > > demand)
> > > * "Per-field compression" is one of the ways to implement the
> > > compression feature
> > >
> > > I'm new to Ignite, so I may be mistaken about some things.
> > > For the last 3-4 months I've tried to start a discussion about a
> > > design, but nobody has answered (except Dmitry and Valentin, who were
> > > interested in how it works).
> > > But I understand that this is a community and nobody is obliged to
> > > anybody.
> > >
> > > There are strong Ignite experts here.
> > > If they can help me and the community with a design of the compression
> > > feature, it will be great.
> > > At the moment I have the desire and time to be engaged in development
> > > of the compression feature in Ignite.
> > > Let's use this opportunity :)
> > >
> > > 2017-06-09 5:36 GMT+03:00 Dmitriy Setrakyan <[hidden email]>:
> > >
> > > > Igniters,
> > > >
> > > > I have never seen a single Ignite user asking about compressing a
> > > > single field. However, we have had requests to secure certain
> > > > fields, e.g. passwords.
> > > >
> > > > I personally do not think per-field compression is needed, unless we
> > > > can point out some concrete real life use cases.
> > > >
> > > > D.
> > > >
> > > > On Thu, Jun 8, 2017 at 3:42 AM, Vyacheslav Daradur <[hidden email]>
> > > > wrote:
> > > >
> > > > > Anton,
> > > > >
> > > > > >> I thought that if compressed data is stored in memory, it will
> > > > > >> be transmitted over the wire compressed too. Is that right?
> > > > >
> > > > > In the per-field compression case, yes.
> > > > >
> > > > > 2017-06-08 13:36 GMT+03:00 Антон Чураев <[hidden email]>:
> > > > >
> > > > > > Guys, could you please help me.
> > > > > > I thought that if compressed data is stored in memory, it will be
> > > > > > transmitted over the wire compressed too. Is that right?
> > > > > >
> > > > > > 2017-06-08 13:30 GMT+03:00 Vyacheslav Daradur <[hidden email]>:
> > > > > >
> > > > > > > Vladimir,
> > > > > > >
> > > > > > > The main problem I am trying to solve is storing data in memory
> > > > > > > in compressed form via Ignite.
> > > > > > > The main goal is to use memory more effectively.
> > > > > > >
> > > > > > > >> here the much simpler step would be to do full
> > > > > > > >> compression on per-cache basis rather than dealing with
> > > > > > > >> per-fields case.
> > > > > > >
> > > > > > > Please explain your idea. Compress data by memory page?
> > > > > > > Is it compatible with querying and indexing?
> > > > > > >
> > > > > > > >> In the end, if user would like to compress particular field,
> > > > > > > >> he can always do it on his own
> > > > > > > I don't think we should think this way: if a user needs
> > > > > > > something, he tries to choose a tool which has this feature OOTB.
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > 2017-06-08 12:53 GMT+03:00 Vladimir Ozerov <[hidden email]>:
> > > > > > >
> > > > > > > > Igniters,
> > > > > > > >
> > > > > > > > Honestly, I still do not see how to apply this feature to Ignite
> > > > > > > > gracefully. And the overall approach of compressing only
> > > > > > > > particular fields looks overcomplicated to me. Remember that our
> > > > > > > > main use case is an application without classes on the server.
> > > > > > > > It means that any kind of annotation is inapplicable. To be more
> > > > > > > > precise: a proper API should be implemented to handle the
> > > > > > > > no-class case (e.g. how would one build such an object through
> > > > > > > > BinaryBuilder without a class?), and only then should
> > > > > > > > annotations be added as a convenient addition to the more basic
> > > > > > > > API.
> > > > > > > >
> > > > > > > > It seems to me that a full implementation, which takes into
> > > > > > > > account a proper "classless" API, changes to binary metadata to
> > > > > > > > reflect compressed fields, changes to SQL, changes to the binary
> > > > > > > > protocol, and porting to .NET and CPP, will yield a very complex
> > > > > > > > solution with little value to the product.
> > > > > > > >
> > > > > > > > Instead, as I proposed earlier, it seems that we'd better start
> > > > > > > > with the problem we are trying to solve. Basically, compression
> > > > > > > > could help in two cases:
> > > > > > > > 1) Transmitting data over the wire - it should be implemented on
> > > > > > > > the communication layer and should not affect the binary
> > > > > > > > serialization component a lot.
> > > > > > > > 2) Storing data in memory - here the much simpler step would be
> > > > > > > > to do full compression on a per-cache basis rather than dealing
> > > > > > > > with the per-field case.
> > > > > > > >
> > > > > > > > In the end, if a user would like to compress a particular field,
> > > > > > > > he can always do it on his own, and set the already compressed
> > > > > > > > field to our BinaryObject.
> > > > > > > >
> > > > > > > > Vladimir.
> > > > > > > >
> > > > > > > >
> > > > > > > > On Thu, Jun 8, 2017 at 12:37 PM, Vyacheslav Daradur <[hidden email]>
> > > > > > > > wrote:
> > > > > > > >
> > > > > > > > > Valentin,
> > > > > > > > >
> > > > > > > > > Yes, I have the prototype[1][2]
> > > > > > > > >
> > > > > > > > > You can see an example of the Java class[3] that I used in my
> > > > > > > > > benchmark.
> > > > > > > > > For example:
> > > > > > > > > class Foo {
> > > > > > > > >     @BinaryCompression
> > > > > > > > >     String data;
> > > > > > > > > }
> > > > > > > > > If a user decides to store the object in compressed form, he
> > > > > > > > > can use the @BinaryCompression annotation as shown above.
> > > > > > > > > It means the annotated field 'data' will be compressed at
> > > > > > > > > marshalling.
> > > > > > > > >
> > > > > > > > > [1] https://github.com/apache/ignite/pull/1951
> > > > > > > > > [2] https://issues.apache.org/jira/browse/IGNITE-5226
> > > > > > > > > [3] https://github.com/daradurvs/ignite-compression/blob/master/src/main/java/ru/daradurvs/ignite/compression/model/Audit1F.java
> > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > 2017-06-08 2:04 GMT+03:00 Valentin Kulichenko <[hidden email]>:
> > > > > > > > >
> > > > > > > > > > Vyacheslav, Anton,
> > > > > > > > > >
> > > > > > > > > > Are there any ideas and/or prototypes for the API? Your design
> > > > > > > > > > suggestions seem to make sense, but I would like to see how all
> > > > > > > > > > this will look from the user's standpoint.
> > > > > > > > > >
> > > > > > > > > > -Val
> > > > > > > > > >
> > > > > > > > > > On Wed, Jun 7, 2017 at 1:06 AM, Антон Чураев <[hidden email]>
> > > > > > > > > > wrote:
> > > > > > > > > >
> > > > > > > > > > > Vyacheslav, correct me if something is wrong.
> > > > > > > > > > >
> > > > > > > > > > > We could give users the ability to choose between CPU usage
> > > > > > > > > > > and MEM/NET usage by compressing some attributes of stored
> > > > > > > > > > > objects.
> > > > > > > > > > > You have studied the design, and it is possible to localize
> > > > > > > > > > > the changes in marshalling without affecting performance or
> > > > > > > > > > > current functionality.
> > > > > > > > > > >
> > > > > > > > > > > I think that it's useful for our project and users.
> > > > > > > > > > > Community, what do you think about this proposal?
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > 2017-06-06 17:29 GMT+03:00 Vyacheslav Daradur <[hidden email]>:
> > > > > > > > > > >
> > > > > > > > > > > > In short,
> > > > > > > > > > > >
> > > > > > > > > > > > > During marshalling, a field is represented by a BinaryFieldAccessor, which
> > > > > > > > > > > > > manages its marshalling. It checks whether the field is marked with the
> > > > > > > > > > > > > @BinaryCompression annotation; if so, the binary representation of the field
> > > > > > > > > > > > > (a byte array) is compressed. It is marked as compressed by a type
> > > > > > > > > > > > > constant (GridBinaryMarshaller.COMPRESSED), and then the compressed byte
> > > > > > > > > > > > > array is included in the binary representation of the whole object. Note that
> > > > > > > > > > > > > the header of the marshalled object is not compressed. Compression affects
> > > > > > > > > > > > > only the representation of the object's fields.
> > > > > > > > > > > >
> > > > > > > > > > > > > Objects in IgniteCache are represented as BinaryObject, which is a wrapper
> > > > > > > > > > > > > over the byte array of the marshalled object.
> > > > > > > > > > > > > BinaryObject provides some useful methods which are used by Ignite
> > > > > > > > > > > > > systems.
> > > > > > > > > > > > > For example, queries use the BinaryObject#field method, which deserializes
> > > > > > > > > > > > > only one field of the object without deserializing the whole object.
> > > > > > > > > > > > > During deserialization, if the BinaryObject#field method meets the
> > > > > > > > > > > > > compressed type constant, it decompresses the byte array and then continues
> > > > > > > > > > > > > unmarshalling as usual.
> > > > > > > > > > > >
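The marker-plus-payload scheme described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the class FieldCompressionSketch, the RAW/COMPRESSED marker values and the writeField/readField helpers are hypothetical and are not taken from the PR or from Ignite, and java.util.zip (Deflate) stands in for ZSTD so that the example needs no extra dependency.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

/** Illustration only: not the code from the PR and not Ignite API. */
final class FieldCompressionSketch {
    // Hypothetical type markers; the real GridBinaryMarshaller constants may differ.
    static final byte RAW = 0;
    static final byte COMPRESSED = 1;

    /** Writes one field as [marker][payload]; the payload is compressed only for annotated fields. */
    static byte[] writeField(byte[] fieldBytes, boolean annotated) {
        byte[] payload = annotated ? deflate(fieldBytes) : fieldBytes;
        byte[] out = new byte[payload.length + 1];
        out[0] = annotated ? COMPRESSED : RAW;
        System.arraycopy(payload, 0, out, 1, payload.length);
        return out;
    }

    /** Reads one field back, decompressing only when the marker says so. */
    static byte[] readField(byte[] stored) {
        byte[] payload = Arrays.copyOfRange(stored, 1, stored.length);
        return stored[0] == COMPRESSED ? inflate(payload) : payload;
    }

    static byte[] deflate(byte[] input) {
        Deflater deflater = new Deflater();
        deflater.setInput(input);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[1024];
        while (!deflater.finished())
            out.write(buf, 0, deflater.deflate(buf));
        deflater.end();
        return out.toByteArray();
    }

    static byte[] inflate(byte[] input) {
        Inflater inflater = new Inflater();
        inflater.setInput(input);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[1024];
        try {
            while (!inflater.finished()) {
                int n = inflater.inflate(buf);
                // No progress before the end of the stream means the input is incomplete.
                if (n == 0 && !inflater.finished())
                    throw new IllegalStateException("Truncated compressed field");
                out.write(buf, 0, n);
            }
        } catch (DataFormatException e) {
            throw new IllegalStateException("Corrupted compressed field", e);
        } finally {
            inflater.end();
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 20; i++)
            sb.append("some long audit message ");
        byte[] field = sb.toString().getBytes(StandardCharsets.UTF_8);

        byte[] stored = writeField(field, true);
        byte[] restored = readField(stored);

        System.out.println("original bytes: " + field.length);
        System.out.println("stored bytes:   " + stored.length);
        System.out.println("round trip ok:  " + Arrays.equals(field, restored));
    }
}
```

Because the marker travels with each field, a reader that asks for a single field only pays the decompression cost for that field, and fields written without the marker are read exactly as before.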
> > > > > > > > > > > > > For now, I have introduced the Compressor interface in IgniteConfiguration;
> > > > > > > > > > > > > it allows users to plug in their own compressor implementation, which is
> > > > > > > > > > > > > the requirement in the task [1].
> > > > > > > > > > > >
> > > > > > > > > > > > > As far as I know, Vladimir Ozerov doesn't like the idea of granting this
> > > > > > > > > > > > > opportunity to the user.
> > > > > > > > > > > > > In that case we can choose a compression algorithm to provide by default
> > > > > > > > > > > > > and move the interface into the internals of the binary infrastructure.
> > > > > > > > > > > > > For this case I've prepared the benchmarks which I sent earlier.
> > > > > > > > > > > >
> > > > > > > > > > > > > I vote for the ZSTD algorithm [2]: it provides a good compression ratio and
> > > > > > > > > > > > > good throughput. It has implementations in Java, .NET and C++ and an
> > > > > > > > > > > > > ASF-friendly license, so we can use it on all Ignite platforms.
> > > > > > > > > > > > > You can look at an assessment of this algorithm in my benchmarks.
> > > > > > > > > > > > >
> > > > > > > > > > > > > [1] https://issues.apache.org/jira/browse/IGNITE-3592
> > > > > > > > > > > > > [2] https://github.com/facebook/zstd
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > > 2017-06-06 16:02 GMT+03:00 Антон Чураев <[hidden email]>:
> > > > > > > > > > > >
> > > > > > > > > > > > > > Looks good to me.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Could you describe the implementation design in a couple of sentences,
> > > > > > > > > > > > > > so that we can estimate the completeness and complexity of the proposal?
> > > > > > > > > > > > >
> > > > > > > > > > > > > > 2017-06-06 15:26 GMT+03:00 Vyacheslav Daradur <[hidden email]>:
> > > > > > > > > > > > >
> > > > > > > > > > > > > > Anton,
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Of course, the solution does not affect the existing implementation. I mean,
> > > > > > > > > > > > > > > nothing changes (including performance) if the user does not use the
> > > > > > > > > > > > > > > @BinaryCompression annotation.
> > > > > > > > > > > > > > > Only if the user decides to compress a specific field or fields of a class
> > > > > > > > > > > > > > > will compression be applied to the annotated fields at marshalling.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > 2017-06-06 15:10 GMT+03:00 Антон Чураев <[hidden email]>:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Vyacheslav,
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Is it possible to propose an implementation that can be switched on
> > > > > > > > > > > > > > > > on demand?
> > > > > > > > > > > > > > > > In that case it should not affect the performance of the current solution.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > I mean that users should decide what is more important for them:
> > > > > > > > > > > > > > > > throughput or memory/network usage.
> > > > > > > > > > > > > > > > Maybe they will choose to compress not all objects, or only some
> > > > > > > > > > > > > > > > attributes of objects.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > 2017-06-06 14:48 GMT+03:00 Vyacheslav Daradur <[hidden email]>:
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Conclusion:
> > > > > > > > > > > > > > > > > The provided solution allows reducing the size of an object in IgniteCache
> > > > > > > > > > > > > > > > > at the cost of reduced throughput (small in some cases); the cost depends
> > > > > > > > > > > > > > > > > on which part of the object is compressed and on the compression algorithm.
> > > > > > > > > > > > > > > > > I mean, we can use memory more effectively, and in some cases it can
> > > > > > > > > > > > > > > > > reduce the load on the interconnect (replication, rebalancing).
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > It will be particularly useful for object fields that hold large text
> > > > > > > > > > > > > > > > > (>~ 250 bytes) and compress well.
> > > > > > > > > > > > > > > >
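The size remark above suggests a simple guard: compress a field only when it is large enough and the result is actually smaller. A minimal sketch of such a guard follows. The class CompressionThresholdSketch, the MIN_SIZE cut-off and the worthCompressing helper are hypothetical names introduced for illustration, the 250-byte value is borrowed from the remark rather than measured, and java.util.zip (Deflate) again stands in for whichever algorithm is finally chosen.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Random;
import java.util.zip.Deflater;

/** Illustration only: a size guard for per-field compression, not Ignite code. */
final class CompressionThresholdSketch {
    // Assumed cut-off borrowed from the ">~ 250 bytes" remark; tune it per workload.
    static final int MIN_SIZE = 250;

    /** Returns true when the field is large enough and really shrinks when compressed. */
    static boolean worthCompressing(byte[] field) {
        if (field.length < MIN_SIZE)
            return false;
        return deflate(field).length < field.length;
    }

    static byte[] deflate(byte[] input) {
        Deflater deflater = new Deflater();
        deflater.setInput(input);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[1024];
        while (!deflater.finished())
            out.write(buf, 0, deflater.deflate(buf));
        deflater.end();
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] shortValue = "id-42".getBytes(StandardCharsets.UTF_8);

        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 30; i++)
            sb.append("user logged in from a new device; ");
        byte[] longText = sb.toString().getBytes(StandardCharsets.UTF_8);

        // Pseudo-random bytes model a badly compressible field.
        byte[] noise = new byte[1000];
        new Random(42).nextBytes(noise);

        System.out.println("short value: " + worthCompressing(shortValue));
        System.out.println("long text:   " + worthCompressing(longText));
        System.out.println("noise:       " + worthCompressing(noise));
    }
}
```

Checking the compressed length costs one extra compression pass at write time, so a real implementation would probably reuse that result as the stored payload instead of compressing twice.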
> > > > > > > > > > > > > > > > > 2017-06-06 12:00 GMT+03:00 Антон Чураев <[hidden email]>:
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > Vyacheslav, thank you! But could you please provide conclusions or
> > > > > > > > > > > > > > > > > > proposals based on these benchmarks?
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > 2017-06-06 11:28 GMT+03:00 Vyacheslav Daradur <[hidden email]>:
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > Dmitry,
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > Excel-pages:
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > 1) "Compression ratio (2)" shows object size with and without
> > > > > > > > > > > > > > > > > > > compression. (Conditions: literal text)
> > > > > > > > > > > > > > > > > > > The 1st graph shows the compression ratios of different compression
> > > > > > > > > > > > > > > > > > > algorithms depending on the size of the compressed field.
> > > > > > > > > > > > > > > > > > > The 2nd graph shows the size of objects depending on size and
> > > > > > > > > > > > > > > > > > > compression algorithm.
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > 2) "Compression ratio (1)" shows object size with and without
> > > > > > > > > > > > > > > > > > > compression. (Conditions: poorly compressible character sequence)
> > > > > > > > > > > > > > > > > > > The 1st graph shows the compression ratios of different compression
> > > > > > > > > > > > > > > > > > > algorithms depending on the size of the compressed field.
> > > > > > > > > > > > > > > > > > > The 2nd graph shows the size of objects depending on size and
> > > > > > > > > > > > > > > > > > > compression algorithm.
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > 3) "put-avg" shows the average time of the "put" operation depending
> > > > > > > > > > > > > > > > > > > on size and compression algorithm.
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > 4) "put-thrpt" shows the throughput of the "put" operation depending
> > > > > > > > > > > > > > > > > > > on size and compression algorithm.
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > 5) "get-avg" shows the average time of the "get" operation depending
> > > > > > > > > > > > > > > > > > > on size and compression algorithm.
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > 6) "get-thrpt" shows the throughput of the "get" operation depending
> > > > > > > > > > > > > > > > > > > on size and compression algorithm.
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > 2017-06-06 10:59 GMT+03:00 Dmitriy Setrakyan <[hidden email]>:
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > Vladimir, I am not sure how to interpret the graphs. What are we
> > > > > > > > > > > > > > > > > > > > looking at?
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > On Tue, Jun 6, 2017 at 12:33 AM, Vyacheslav Daradur <[hidden email]> wrote:
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > Hi, Igniters.
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > I've prepared some benchmarks. Results: [1].
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > I've also prepared an evaluation in the form of diagrams [2].
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > I hope this helps to interest the community and speeds up a reaction
> > > > > > > > > > > > > > > > > > > > > to this improvement :)
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > [1] https://github.com/daradurvs/ignite-compression/tree/master/src/main/resources/result
> > > > > > > > > > > > > > > > > > > > > [2] https://drive.google.com/file/d/0B2CeUAOgrHkoMklyZ25YTEdKcEk/view
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > 2017-05-24 9:49 GMT+03:00 Vyacheslav Daradur <[hidden email]>:
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > Guys, any thoughts?
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > 2017-05-16 13:40 GMT+03:00 Vyacheslav Daradur <[hidden email]>:
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > >> Hi guys,
> > > > > > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > > > > > > >> I've prepared a PR to show my idea.
> > > > > > > > > > > > > > > > > > > > > >> https://github.com/apache/ignite/pull/1951/files
> > > > > > > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > > > > > > >> About querying: I've just copied the existing tests and annotated
> > > > > > > > > > > > > > > > > > > > > >> the test data.
> > > > > > > > > > > > > > > > > > > > > >> https://github.com/apache/ignite/pull/1951/files#diff-c19a9df4058141d059bb577e75244764
> > > > > > > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > > > > > > >> It means that fields marked with @BinaryCompression will be
> > > > > > > > > > > > > > > > > > > > > >> compressed at marshalling via BinaryMarshaller.
> > > > > > > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > > > > > > >> This solution has no effect on existing data or project architecture.
> > > > > > > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > > > > > > >> I'll be glad to hear your thoughts.
> > > > > > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > > > > > > >> 2017-05-15 19:18 GMT+03:00 Vyacheslav Daradur <[hidden email]>:
> > > > > > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > > > > > >>> Dmitriy,
> > > > > > > > > > > > > > > > > > > > >>>
> > > > > > > > > > > > > > > > > > > > > >>> I have a ready prototype and want to show it.
> > > > > > > > > > > > > > > > > > > > > >>> It is always easier to discuss with an example.
> > > > > > > > > > > > > > > > > > > > >>>
> > > > > > > > > > > > > > > > > > > > > >>> 2017-05-15 19:02 GMT+03:00 Dmitriy Setrakyan <[hidden email]>:
> > > > > > > > > > > > > > > > > > > > >>>
> > > > > > > > > > > > > > > > > > > > >>>> Vyacheslav,
> > > > > > > > > > > > > > > > > > > > >>>>
> > > > > > > > > > > > > > > > > > > > > >>>> I think it is a bit premature to provide a PR without getting a
> > > > > > > > > > > > > > > > > > > > > >>>> community consensus on the dev list. Please allow some time for the
> > > > > > > > > > > > > > > > > > > > > >>>> community to respond.
> > > > > > > > > > > > > > > > > > > > >>>>
> > > > > > > > > > > > > > > > > > > > >>>> D.
> > > > > > > > > > > > > > > > > > > > >>>>
> > > > > > > > > > > > > > > > > > > > > >>>> On Mon, May 15, 2017 at 6:36 AM, Vyacheslav Daradur <[hidden email]> wrote:
> > > > > > > > > > > > > > > > > > > > >>>>
> > > > > > > > > > > > > > > > > > > > > >>>> > I created the ticket: https://issues.apache.org/jira/browse/IGNITE-5226
> > > > > > > > > > > > > > > > > > > > > >>>> >
> > > > > > > > > > > > > > > > > > > > > >>>> > I'll prepare a PR with the described solution in a couple of days.
> > > > > > > > > > > > > > > > > > > > >>>> >
> > > > > > > > > > > > > > > > > > > > > >>>> > 2017-05-15 15:05 GMT+03:00 Vyacheslav Daradur <[hidden email]>:
> > > > > > > > > > > > > > > > > > > > >>>> >
> > > > > > > > > > > > > > > > > > > > >>>> > > Hi, Igniters!
> > > > > > > > > > > > > > > > > > > > >>>> > >
> > > > > > > > > > > > > > > > > > > > > >>>> > > Apache Ignite 2.0 is released.
> > > > > > > > > > > > > > > > > > > > > >>>> > >
> > > > > > > > > > > > > > > > > > > > > >>>> > > Let's continue the discussion about the compression design.
> > > > > > > > > > > > > > > > > > > > > >>>> > >
> > > > > > > > > > > > > > > > > > > > > >>>> > > At the moment, I have found only one solution which is compatible
> > > > > > > > > > > > > > > > > > > > > >>>> > > with querying and indexing: per-object-field compression.
> > > > > > > > > > > > > > > > > > > > > >>>> > > Per-field compression means that the metadata (header) of an object
> > > > > > > > > > > > > > > > > > > > > >>>> > > won't be compressed; only the serialized values of the object's
> > > > > > > > > > > > > > > > > > > > > >>>> > > fields (in byte array form) will be compressed.
> > > > > > > > > > > > > > > > > > > > > >>>> > >
> > > > > > > > > > > > > > > > > > > > > >>>> > > This solution has some contentious issues:
> > > > > > > > > > > > > > > > > > > > > >>>> > > - small values, like primitives and short arrays: it makes no sense
> > > > > > > > > > > > > > > > > > > > > >>>> > > to compress them;
> > > > > > > > > > > > > > > > > > > > > >>>> > > - it is not possible to use compression with Java predefined types.
> > > > > > > > > > > > > > > > > > > > > >>>> > >
> > > > > > > > > > > > > > > > > > > > > >>>> > > We can provide an annotation, @IgniteCompression for example, which
> > > > > > > > > > > > > > > > > > > > > >>>> > > users can apply to mark fields for compression.
> > > > > > > > > > > > > > > > > > > > > >>>> > >
> > > > > > > > > > > > > > > > > > > > > >>>> > > Any thoughts?
> > > > > > > > > > > > > > > > > > > > > >>>> > >
> > > > > > > > > > > > > > > > > > > > > >>>> > > Maybe someone already has a ready design?
> > > > > > > > > > > > > > > > > > > > >>>> > >
> > > > > > > > > > > > > > > > > > > > > >>>> > > 2017-04-10 11:06 GMT+03:00 Vyacheslav Daradur <[hidden email]>:
> > > > > > > > > > > > > > > > > > > > >>>> > >
> > > > > > > > > > > > > > > > > > > > >>>> > >> Alexey,
> > > > > > > > > > > > > > > > > > > > >>>> > >>
> > > > > > > > > > > > > > > > > > > > >>>> > >> Yes, I've read it.
> > > > > > > > > > > > > > > > > > > > >>>> > >>
> > > > > > > > > > > > > > > > > > > > > >>>> > >> Ok, let's discuss the public API design.
> > > > > > > > > > > > > > > > > > > > > >>>> > >>
> > > > > > > > > > > > > > > > > > > > > >>>> > >> I think we need to add a configuration entity to CacheConfiguration,
> > > > > > > > > > > > > > > > > > > > > >>>> > >> which will contain the Compressor interface implementation and some
> > > > > > > > > > > > > > > > > > > > > >>>> > >> useful parameters.
> > > > > > > > > > > > > > > > > > > > > >>>> > >> Or maybe provide a BinaryMarshaller decorator which will compress
> > > > > > > > > > > > > > > > > > > > > >>>> > >> data after marshalling.
> > > > > > > > > > > > > > > > > > > > >>>> > >>
> > > > > > > > > > > > > > > > > > > > >>>> > >>
> > > > > > > > > > > > > > > > > > > > > >>>> > >> 2017-04-10 10:40 GMT+03:00 Alexey Kuznetsov <[hidden email]>:
> > > > > > > > > > > > > > > > > > > > >>>> > >>
> > > > > > > > > > > > > > > > > > > > >>>> > >>> Vyacheslav,
> > > > > > > > > > > > > > > > > > > > >>>> > >>>
> > > > > > > > > > > > > > > > > > > > > >>>> > >>> Did you read the initial discussion [1] about compression?
> > > > > > > > > > > > > > > > > > > > > >>>> > >>> As far as I remember, we agreed to add only some "top-level" API in
> > > > > > > > > > > > > > > > > > > > > >>>> > >>> order to provide a way for Ignite users to inject some sort of custom
> > > > > > > > > > > > > > > > > > > > > >>>> > >>> compression.
> > > > > > > > > > > > > > > > > > > > > >>>> > >>>
> > > > > > > > > > > > > > > > > > > > > >>>> > >>> [1] http://apache-ignite-developers.2346864.n4.nabble.com/Data-compression-in-Ignite-2-0-td10099.html
> > > > > > > > > > > > > > > > > > > > >>>> > >>>
> > > > > > > > > > > > > > > > > > > > > >>>> > >>> On Mon, Apr 10, 2017 at 2:19 PM, daradurvs <[hidden email]> wrote:
> > > > > > > > > > > > > > > > > > > > >>>> > >>>
> > > > > > > > > > > > > > > > > > > > >>>> > >>> > Hi Igniters!
> > > > > > > > > > > > > > > > > > > > >>>> > >>> >
> > > > > > > > > > > > > > > > > > > > > >>>> > >>> > I am interested in this task:
> > > > > > > > > > > > > > > > > > > > > >>>> > >>> > Provide some kind of pluggable compression SPI support
> > > > > > > > > > > > > > > > > > > > > >>>> > >>> > <https://issues.apache.org/jira/browse/IGNITE-3592>
> > > > > > > > > > > > > > > > > > > > > >>>> > >>> >
> > > > > > > > > > > > > > > > > > > > > >>>> > >>> > I developed a solution at the BinaryMarshaller level, but the
> > > > > > > > > > > > > > > > > > > > > >>>> > >>> > reviewer rejected it.
> > > > > > > > > > > > > > > > > > > > > >>>> > >>> >
> > > > > > > > > > > > > > > > > > > > > >>>> > >>> > Let's continue the discussion of the task goals and solution design.
> > > > > > > > > > > > > > > > > > > > > >>>> > >>> > As I understand it, the main goal of this task is to store data in
> > > > > > > > > > > > > > > > > > > > > >>>> > >>> > compressed form.
> > > > > > > > > > > > > > > > > > > > > >>>> > >>> > This is what I need from Ignite as its user. Compression saves
> > > > > > > > > > > > > > > > > > > > > >>>> > >>> > resources on servers.
> > > > > > > > > > > > > > > > > > > > > >>>> > >>> > We can store more data on the same servers at the cost of increased
> > > > > > > > > > > > > > > > > > > > > >>>> > >>> > CPU utilization.
> > > > > > > > > > > > > > > > > > > > > >>>> > >>> >
> > > > > > > > > > > > > > > > > > > > > >>>> > >>> > I'm researching the possibility of implementing compression at the
> > > > > > > > > > > > > > > > > > > > > >>>> > >>> > cache level.
> > > > > > > > > > > > > > > > > > > > >>>> > >>> >
> > > > > > > > > > > > > > > > > > > > >>>> > >>> > Any thoughts?
> > > > > > > > > > > > > > > > > > > > >>>> > >>> >
> > > > > > > > > > > > > > > > > > > > >>>> > >>> > --
> > > > > > > > > > > > > > > > > > > > >>>> > >>> > Best regards,
> > > > > > > > > > > > > > > > > > > > >>>> > >>> > Vyacheslav
> > > > > > > > > > > > > > > > > > > > >>>> > >>> >
> > > > > > > > > > > > > > > > > > > > >>>> > >>> >
> > > > > > > > > > > > > > > > > > > > >>>> > >>> >
> > > > > > > > > > > > > > > > > > > > >>>> > >>> >
> > > > > > > > > > > > > > > > > > > > >>>> > >>> > --
> > > > > > > > > > > > > > > > > > > > > >>>> > >>> > View this message in context: http://apache-ignite-developers.2346864.n4.nabble.com/Data-compression-in-Ignite-2-0-tp10099p16317.html
> > > > > > > > > > > > > > > > > > > > > >>>> > >>> > Sent from the Apache Ignite Developers mailing list archive at Nabble.com.
> > > > > > > > > > > > > > > > > > > > >>>> > >>> >
> > > > > > > > > > > > > > > > > > > > >>>> > >>>
> > > > > > > > > > > > > > > > > > > > >>>> > >>>
> > > > > > > > > > > > > > > > > > > > >>>> > >>>
> > > > > > > > > > > > > > > > > > > > >>>> > >>> --
> > > > > > > > > > > > > > > > > > > > >>>> > >>> Alexey Kuznetsov
> > > > > > > > > > > > > > > > > > > > >>>> > >>>
> > > > > > > > > > > > > > > > > > > > >>>> > >>
> > > > > > > > > > > > > > > > > > > > >>>> > >>
> > > > > > > > > > > > > > > > > > > > >>>> > >>
> > > > > > > > > > > > > > > > > > > > >>>> > >> --
> > > > > > > > > > > > > > > > > > > > >>>> > >> Best Regards, Vyacheslav
> > > > > > > > > > > > > > > > > > > > >>>> > >>
> > > > > > > > > > > > > > > > > > > > >>>> > >

Re: Data compression in Ignite 2.0

daradurvs
Vladimir,

>> Nobody in a sane mind will store passwords in plain form. Instead, user
>> should encrypt it on his own, choosing proper encryption parameters -
>> algorithms, key lengths, salts, etc.
Sounds reasonable to me.
But if someone wants to have this feature OOTB, we can continue the discussion
and maybe implement it in some other way.
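
For reference, the "encrypt it on his own" route could look roughly like the sketch below on the application side. This is only an illustration: the class name FieldCrypto and its methods are hypothetical, nothing here is Ignite API, and key storage and rotation are left out. It uses AES-GCM from the JDK's javax.crypto package and stores the random IV in front of the ciphertext.

```java
import java.nio.ByteBuffer;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;

/** Hypothetical application-side helper; not part of Ignite. */
final class FieldCrypto {
    private static final int IV_LEN = 12;     // IV length in bytes, the usual choice for GCM
    private static final int TAG_BITS = 128;  // authentication tag length in bits
    private static final SecureRandom RND = new SecureRandom();

    /** Generates a fresh 128-bit AES key (key management is out of scope here). */
    static SecretKey newKey() throws GeneralSecurityException {
        KeyGenerator gen = KeyGenerator.getInstance("AES");
        gen.init(128);
        return gen.generateKey();
    }

    /** Returns IV || ciphertext+tag, ready to be stored as an ordinary byte[] field. */
    static byte[] encrypt(SecretKey key, byte[] plain) throws GeneralSecurityException {
        byte[] iv = new byte[IV_LEN];
        RND.nextBytes(iv); // a new random IV for every value
        Cipher c = Cipher.getInstance("AES/GCM/NoPadding");
        c.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(TAG_BITS, iv));
        byte[] ct = c.doFinal(plain);
        return ByteBuffer.allocate(IV_LEN + ct.length).put(iv).put(ct).array();
    }

    /** Reads the IV from the first IV_LEN bytes and decrypts the rest. */
    static byte[] decrypt(SecretKey key, byte[] stored) throws GeneralSecurityException {
        Cipher c = Cipher.getInstance("AES/GCM/NoPadding");
        c.init(Cipher.DECRYPT_MODE, key, new GCMParameterSpec(TAG_BITS, stored, 0, IV_LEN));
        return c.doFinal(stored, IV_LEN, stored.length - IV_LEN);
    }
}
```

The application would call encrypt before putting the value into the cache and decrypt after reading it. Such a field is opaque to the server, so it cannot be used meaningfully in SQL predicates or indexes.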

>> How are you going to expose this in API or configuration?
Just for example: we can provide a pluggable interface in the
IgniteConfiguration (or some other place), which the user will be able to implement.
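
A rough sketch of what such a pluggable interface could look like is below. All names (FieldTransformer, DeflateFieldTransformer) are hypothetical and do not exist in Ignite, and how an implementation would be registered in IgniteConfiguration is deliberately left open. The example implementation uses only the JDK's java.util.zip classes; the same contract would fit an encrypting implementation as well.

```java
import java.io.ByteArrayOutputStream;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

/** Hypothetical extension point: transforms the bytes of a single field. Not Ignite API. */
interface FieldTransformer {
    /** Called at marshalling with the field's binary representation. */
    byte[] encode(byte[] plain);

    /** Called at unmarshalling; must be the exact inverse of encode. */
    byte[] decode(byte[] encoded);
}

/** Example user implementation backed by the JDK's DEFLATE codec. */
class DeflateFieldTransformer implements FieldTransformer {
    @Override public byte[] encode(byte[] plain) {
        Deflater deflater = new Deflater();
        try {
            deflater.setInput(plain);
            deflater.finish();
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[1024];
            while (!deflater.finished())
                out.write(buf, 0, deflater.deflate(buf));
            return out.toByteArray();
        } finally {
            deflater.end(); // release native resources
        }
    }

    @Override public byte[] decode(byte[] encoded) {
        Inflater inflater = new Inflater();
        try {
            inflater.setInput(encoded);
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[1024];
            while (!inflater.finished()) {
                int n = inflater.inflate(buf);
                // No progress and nothing left to read means the input is incomplete.
                if (n == 0 && (inflater.needsInput() || inflater.needsDictionary()))
                    throw new IllegalArgumentException("Truncated or invalid data");
                out.write(buf, 0, n);
            }
            return out.toByteArray();
        } catch (DataFormatException e) {
            throw new IllegalArgumentException("Not valid DEFLATE data", e);
        } finally {
            inflater.end();
        }
    }
}
```

Keeping the contract at the level of byte[] in and byte[] out means the interface does not depend on any class being present on the server.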

About compression, you wrote:
>> 2) Storing data in memory - here the much simpler step would be to full
>> compression on per-cache basis rather than dealing with per-fields case.
Could you explain your idea? How can we implement it, and how will it stay
compatible with querying and indexing?
Thanks in advance.


2017-06-09 10:50 GMT+03:00 Vladimir Ozerov <[hidden email]>:

> Dima,
>
> Encryption of certain fields is as bad as compression. First, it is a huge
> change, which makes the already complex binary protocol even more complex.
> Second, it has to be ported to the CPP and .NET platforms, as well as to JDBC
> and ODBC.
> Last, but most important - it is not our headache to encrypt
> sensitive data. This is the user's responsibility. Nobody in a sane mind will
> store passwords in plain form. Instead, the user should encrypt them on his own,
> choosing proper encryption parameters - algorithms, key lengths, salts,
> etc. How are you going to expose this in API or configuration?
>
> We should not implement data encryption on the binary level, this is out of
> the question. Encryption should be implemented on the application level (user
> efforts), the transport layer (SSL - we already have it), and possibly on
> the disk level (there are tools for this already).
>
>
> On Fri, Jun 9, 2017 at 9:06 AM, Vyacheslav Daradur <[hidden email]>
> wrote:
>
> > >> which is much less useful.
> > I note that in some cases the size of an object is reduced by more than a
> > factor of two.
> >
> > >> Would it be possible to change your implementation to handle the
> > >> encryption instead?
> > Yes, of course, there's not much difference between compression and
> > encryption, including in my implementation of per-field compression.
> >
> > 2017-06-09 8:55 GMT+03:00 Dmitriy Setrakyan <[hidden email]>:
> >
> > > Vyacheslav,
> > >
> > > When this feature started out as data compression in Ignite, it sounded
> > > very useful. Now it is unfolding as a per-field compression, which is
> > > much less useful. In fact, it is questionable whether it is useful at all.
> > > The fact that this feature is implemented does not make it mandatory for
> > > the community to accept it.
> > >
> > > However, as I mentioned before, per-field encryption is very useful, as
> > > it would allow users to automatically encrypt certain sensitive fields, like
> > > passwords, credit card numbers, etc. There is not much conceptual
> > > difference between compressing a field vs encrypting a field. Would it
> > > be possible to change your implementation to handle the encryption
> > > instead?
> > >
> > > D.
> > >
> > > On Thu, Jun 8, 2017 at 10:42 PM, Vyacheslav Daradur <[hidden email]>
> > > wrote:
> > >
> > > > Guys, I want to be clear:
> > > > * The "per-field compression" design is the result of research into the
> > > > binary infrastructure of Ignite and some of its other parts (querying,
> > > > indexing, etc.)
> > > > * Full compression of an object will be more effective, but in this case
> > > > there is no compatibility with querying and indexing (or there is a large
> > > > overhead from decompressing the full object (or cache pages) on demand)
> > > > * "Per-field compression" is one of the ways to implement the compression
> > > > feature
> > > >
> > > > I'm new to Ignite, so I may be mistaken in some things.
> > > > For the last 3-4 months I've tried to start a discussion about the design,
> > > > but nobody answered (except Dmitry and Valentin, who were interested in how
> > > > it works).
> > > > But I understand that this is a community and nobody is obliged to anybody.
> > > >
> > > > There are strong Ignite experts here.
> > > > If they can help me and the community with a design of the compression
> > > > feature, it will be great.
> > > > At the moment I have the desire and the time to work on development of
> > > > the compression feature in Ignite.
> > > > Let's use this opportunity :)
> > > >
> > > > 2017-06-09 5:36 GMT+03:00 Dmitriy Setrakyan <[hidden email]>:
> > > >
> > > > > Igniters,
> > > > >
> > > > > I have never seen a single Ignite user asking about compressing a
> > > > > single field. However, we have had requests to secure certain fields, e.g.
> > > > > passwords.
> > > > >
> > > > > I personally do not think per-field compression is needed, unless we
> > > > > can point out some concrete real life use cases.
> > > > >
> > > > > D.
> > > > >
> > > > > On Thu, Jun 8, 2017 at 3:42 AM, Vyacheslav Daradur <
> > > [hidden email]>
> > > > > wrote:
> > > > >
> > > > > > Anton,
> > > > > >
> > > > > > >> I thought that if there will storing compressed data in the
> > > memory,
> > > > > data
> > > > > > >> will transmit over wire in compression too. Is it right?
> > > > > >
> > > > > > In per-field compression case - yes.
> > > > > >
> > > > > > 2017-06-08 13:36 GMT+03:00 Антон Чураев <[hidden email]>:
> > > > > >
> > > > > > > Guys, could you please help me.
> > > > > > > I thought that if there will storing compressed data in the
> > memory,
> > > > > data
> > > > > > > will transmit over wire in compression too. Is it right?
> > > > > > >
> > > > > > > 2017-06-08 13:30 GMT+03:00 Vyacheslav Daradur <
> > [hidden email]
> > > >:
> > > > > > >
> > > > > > > > Vladimir,
> > > > > > > >
> > > > > > > > The main problem which I'am trying to solve is storing data
> in
> > > > memory
> > > > > > in
> > > > > > > a
> > > > > > > > compression form via Ignite.
> > > > > > > > The main goal is using memory more effectivelly.
> > > > > > > >
> > > > > > > > >> here the much simpler step would be to full
> > > > > > > > compression on per-cache basis rather than dealing with
> > > per-fields
> > > > > > case.
> > > > > > > >
> > > > > > > > Please explain your idea. Compess data by memory-page?
> > > > > > > > Is it compatible with quering and indexing?
> > > > > > > >
> > > > > > > > >> In the end, if user would like to compress particular
> field,
> > > he
> > > > > can
> > > > > > > > always to it on his own
> > > > > > > > I think we mustn't think in this way, if user need something
> he
> > > > > trying
> > > > > > to
> > > > > > > > choose a tool which has this feature OOTB.
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > 2017-06-08 12:53 GMT+03:00 Vladimir Ozerov <
> > [hidden email]
> > > >:
> > > > > > > >
> > > > > > > > > Igniters,
> > > > > > > > >
> > > > > > > > > Honestly I still do not see how to apply it gracefully this
> > > > feature
> > > > > > ti
> > > > > > > > > Ignite. And overall approach to compress only particular
> > fields
> > > > > looks
> > > > > > > > > overcomplicated to me. Remember, that our main use case is
> an
> > > > > > > application
> > > > > > > > > without classes on the server. It means that any kind of
> > > > > annotations
> > > > > > > are
> > > > > > > > > inapplicable. To be more precise: proper API should be
> > > > implemented
> > > > > to
> > > > > > > > > handle no-class case (e.g. how would build such an object
> > > through
> > > > > > > > > BinaryBuilder without a class?), and only then add
> > annotations
> > > as
> > > > > > > > > convenient addition to more basic API.
> > > > > > > > >
> > > > > > > > > It seems to me that full implementation, which takes in
> count
> > > > > proper
> > > > > > > > > "classless" API, changes to binary metadata to reflect
> > > compressed
> > > > > > > fields,
> > > > > > > > > changes to SQL, changes to binary protocol, and porting to
> > .NET
> > > > and
> > > > > > > CPP,
> > > > > > > > > will yield very complex solution with little value to the
> > > > product.
> > > > > > > > >
> > > > > > > > > Instead, as I proposed earlier, it seems that we'd better
> > start
> > > > > with
> > > > > > > the
> > > > > > > > > problem we are trying to solve. Basically, compression
> could
> > > help
> > > > > in
> > > > > > > two
> > > > > > > > > cases:
> > > > > > > > > 1) Transmitting data over wire - it should be implemented
> on
> > > > > > > > communication
> > > > > > > > > layer and should not affect binary serialization component
> a
> > > lot.
> > > > > > > > > 2) Storing data in memory - here the much simpler step
> would
> > be
> > > > to
> > > > > > full
> > > > > > > > > compression on per-cache basis rather than dealing with
> > > > per-fields
> > > > > > > case.
> > > > > > > > >
> > > > > > > > > In the end, if user would like to compress particular
> field,
> > he
> > > > can
> > > > > > > > always
> > > > > > > > > to it on his own, and set already compressed field to our
> > > > > > BinaryObject.
> > > > > > > > >
> > > > > > > > > Vladimir.
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > On Thu, Jun 8, 2017 at 12:37 PM, Vyacheslav Daradur <
> > > > > > > [hidden email]
> > > > > > > > >
> > > > > > > > > wrote:
> > > > > > > > >
> > > > > > > > > > Valentin,
> > > > > > > > > >
> > > > > > > > > > Yes, I have the prototype[1][2]
> > > > > > > > > >
> > > > > > > > > > You can see an example of Java class[3] that I used in my
> > > > > > benchmark.
> > > > > > > > > > For example:
> > > > > > > > > > class Foo {
> > > > > > > > > > @BinaryCompression
> > > > > > > > > > String data;
> > > > > > > > > > }
> > > > > > > > > > If user make decision to store the object in compressed
> > form,
> > > > he
> > > > > > can
> > > > > > > > use
> > > > > > > > > > the annotation @BinaryCompression as shown above.
> > > > > > > > > > It means annotated field 'data' will be compressed at
> > > > > marshalling.
> > > > > > > > > >
> > > > > > > > > > [1] https://github.com/apache/ignite/pull/1951
> > > > > > > > > > [2] https://issues.apache.org/jira/browse/IGNITE-5226
> > > > > > > > > > [3]
> > > > > > > > > > https://github.com/daradurvs/ignite-compression/blob/
> > > > > > > > > > master/src/main/java/ru/daradurvs/ignite/compression/
> > > > > > > > model/Audit1F.java
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > 2017-06-08 2:04 GMT+03:00 Valentin Kulichenko <
> > > > > > > > > > [hidden email]
> > > > > > > > > > >:
> > > > > > > > > >
> > > > > > > > > > > Vyacheslav, Anton,
> > > > > > > > > > >
> > > > > > > > > > > Are there any ideas and/or prototypes for the API? Your
> > > > design
> > > > > > > > > > suggestions
> > > > > > > > > > > seem to make sense, but I would like to see how it all
> > this
> > > > > will
> > > > > > > like
> > > > > > > > > > from
> > > > > > > > > > > user's standpoint.
> > > > > > > > > > >
> > > > > > > > > > > -Val
> > > > > > > > > > >
> > > > > > > > > > > On Wed, Jun 7, 2017 at 1:06 AM, Антон Чураев <
> > > > > > [hidden email]
> > > > > > > >
> > > > > > > > > > wrote:
> > > > > > > > > > >
> > > > > > > > > > > > Vyacheslav, correct me if something wrong
> > > > > > > > > > > >
> > > > > > > > > > > > We could provide opportunity of choose between CPU
> > usage
> > > > and
> > > > > > > > MEM/NET
> > > > > > > > > > > usage
> > > > > > > > > > > > for users by compression some attributes of stored
> > > objects.
> > > > > > > > > > > > You have learned design, and it is possible to
> localize
> > > > > changes
> > > > > > > in
> > > > > > > > > > > > marshalling without performance affect and current
> > > > > > functionality.
> > > > > > > > > > > >
> > > > > > > > > > > > I think, that it's usefull for our project and users.
> > > > > > > > > > > > Community, what do you think about this proposal?
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > 2017-06-06 17:29 GMT+03:00 Vyacheslav Daradur <
> > > > > > > [hidden email]
> > > > > > > > >:
> > > > > > > > > > > >
> > > > > > > > > > > > > In short,
> > > > > > > > > > > > >
> > > > > > > > > > > > > During marshalling a fields is represented as
> > > > > > > BinaryFieldAccessor
> > > > > > > > > > which
> > > > > > > > > > > > > manages its marshalling. It checks if the field is
> > > marked
> > > > > by
> > > > > > > > > > annotation
> > > > > > > > > > > > > @BinaryCompression, in that case - binary
> > > representation
> > > > > of
> > > > > > > > field
> > > > > > > > > > > (bytes
> > > > > > > > > > > > > array) will be compressed. It will be marked as
> > > > compressed
> > > > > by
> > > > > > > > types
> > > > > > > > > > > > > constant (GridBinaryMarshaller.COMPRESSED), after
> > this
> > > > the
> > > > > > > > > > compressed
> > > > > > > > > > > > > bytes
> > > > > > > > > > > > > array wiil be include in binary representation of
> > whole
> > > > > > object.
> > > > > > > > > Note,
> > > > > > > > > > > > > header of marshalled object will not be compressed.
> > > > > > Compression
> > > > > > > > > > > affected
> > > > > > > > > > > > > only object's field representation.
> > > > > > > > > > > > >
> > > > > > > > > > > > > Objects in IgniteCache is represented as
> BinaryObject
> > > > which
> > > > > > is
> > > > > > > > > > wrapper
> > > > > > > > > > > > over
> > > > > > > > > > > > > bytes array of marshalled object.
> > > > > > > > > > > > > BinaryObject provides some usefull methods, which
> are
> > > > used
> > > > > by
> > > > > > > > > Ignite
> > > > > > > > > > > > > systems.
> > > > > > > > > > > > > For example, the Queries use BinaryObject#field
> > method,
> > > > > which
> > > > > > > > > > > > deserializes
> > > > > > > > > > > > > only field of object, without deserializing of
> whole
> > > > > object.
> > > > > > > > > > > > > BinaryObject#field method during deserialization,
> if
> > > > meets
> > > > > > the
> > > > > > > > > > constant
> > > > > > > > > > > > of
> > > > > > > > > > > > > compressed type, decompress this bytes array, then
> > > > continue
> > > > > > > > > > > unmarshalling
> > > > > > > > > > > > > as usual.
> > > > > > > > > > > > >
> > > > > > > > > > > > > Now, I introduced the Compressor interface in
> > > > > > > > IgniteConfigurations,
> > > > > > > > > > it
> > > > > > > > > > > > > allows user to use own implementation of
> compressor -
> > > it
> > > > is
> > > > > > the
> > > > > > > > > > > > requirement
> > > > > > > > > > > > > in the task[1].
> > > > > > > > > > > > >
> > > > > > > > > > > > > As far as I know, Vladimir Ozerov doesn't like the
> > idea
> > > > of
> > > > > > > > granting
> > > > > > > > > > > this
> > > > > > > > > > > > > opportunity to the user.
> > > > > > > > > > > > > In that case we can choose a compression algorithm
> > > which
> > > > we
> > > > > > > will
> > > > > > > > > > > provide
> > > > > > > > > > > > by
> > > > > > > > > > > > > default and will move the interface to internals of
> > > > binary
> > > > > > > > > > > > infractructure.
> > > > > > > > > > > > > For this case I've prepared benchmarked, which I've
> > > sent
> > > > > > > earlier.
> > > > > > > > > > > > >
> > > > > > > > > > > > > I vote for ZSTD algorithm[2], it provides good
> > > > compression
> > > > > > > ratio
> > > > > > > > > and
> > > > > > > > > > > good
> > > > > > > > > > > > > throughput. It has implementation in Java, .NET and
> > > C++,
> > > > > and
> > > > > > > has
> > > > > > > > > > > > > ASF-friendly license, we can use it in the all
> Ignite
> > > > > > > platforms.
> > > > > > > > > > > > > You can look at an assessment of this algorithm in
> my
> > > > > > > benchmark's
> > > > > > > > > > > > >
> > > > > > > > > > > > > [1] https://issues.apache.org/
> > jira/browse/IGNITE-3592
> > > > > > > > > > > > > [2]https://github.com/facebook/zstd
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > 2017-06-06 16:02 GMT+03:00 Антон Чураев <
> > > > > > [hidden email]
> > > > > > > >:
> > > > > > > > > > > > >
> > > > > > > > > > > > > > Looks good for me.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Could You propose design of implementation in
> > couple
> > > of
> > > > > > > > > sentences?
> > > > > > > > > > > > > > So that we can estimate the completeness and
> > > complexity
> > > > > of
> > > > > > > the
> > > > > > > > > > > > proposal.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > 2017-06-06 15:26 GMT+03:00 Vyacheslav Daradur <
> > > > > > > > > [hidden email]
> > > > > > > > > > >:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Anton,
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Of course, the solution does not affect on
> > existing
> > > > > > > > > > > implementation. I
> > > > > > > > > > > > > > mean,
> > > > > > > > > > > > > > > there is no changes if user not use the
> > annotation
> > > > > > > > > > > > @BinaryCompression.
> > > > > > > > > > > > > > (no
> > > > > > > > > > > > > > > performance changes)
> > > > > > > > > > > > > > > Only if user make decision to use compression
> on
> > > > > specific
> > > > > > > > field
> > > > > > > > > > or
> > > > > > > > > > > > > fields
> > > > > > > > > > > > > > > of a class - in that case compression will be
> > used
> > > at
> > > > > > > > > marshalling
> > > > > > > > > > > in
> > > > > > > > > > > > > > > relation to annotated fields.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > 2017-06-06 15:10 GMT+03:00 Антон Чураев <
> > > > > > > > [hidden email]
> > > > > > > > > >:
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Vyacheslav,
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Is it possible to propose implementation that
> > can
> > > > be
> > > > > > > > switched
> > > > > > > > > > on
> > > > > > > > > > > > > > > on-demand?
> > > > > > > > > > > > > > > > In this case it should not affect performance
> > of
> > > > > > current
> > > > > > > > > > > solution.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > I mean, that users should make decision what
> is
> > > > more
> > > > > > > > > important
> > > > > > > > > > > for
> > > > > > > > > > > > > > them:
> > > > > > > > > > > > > > > > throutput or memory/net usage.
> > > > > > > > > > > > > > > > May be they will be choose not all objects,
> or
> > > only
> > > > > > some
> > > > > > > > > > > attributes
> > > > > > > > > > > > > of
> > > > > > > > > > > > > > > > objects for compress.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > 2017-06-06 14:48 GMT+03:00 Vyacheslav
> Daradur <
> > > > > > > > > > > [hidden email]
> > > > > > > > > > > > >:
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Conclusion:
> > > > > > > > > > > > > > > > > Provided solution allows reduce size of an
> > > object
> > > > > in
> > > > > > > > > > > IgniteCache
> > > > > > > > > > > > at
> > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > cost of throughput reduction (small - in
> some
> > > > > cases),
> > > > > > > it
> > > > > > > > > > > depends
> > > > > > > > > > > > on
> > > > > > > > > > > > > > > part
> > > > > > > > > > > > > > > > of
> > > > > > > > > > > > > > > > > object which will be compressed and
> > compression
> > > > > > > > algorithm.
> > > > > > > > > > > > > > > > > I mean, we can make more effective use of
> > > memory,
> > > > > and
> > > > > > > in
> > > > > > > > > some
> > > > > > > > > > > > cases
> > > > > > > > > > > > > > it
> > > > > > > > > > > > > > > > can
> > > > > > > > > > > > > > > > > reduce loading of the interconnect.
> > > (replication,
> > > > > > > > > > rebalancing)
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Especially, it will be particularly useful
> > for
> > > > > > object's
> > > > > > > > > > fields
> > > > > > > > > > > > > which
> > > > > > > > > > > > > > > are
> > > > > > > > > > > > > > > > > large text (>~ 250 bytes) and can be
> > > effectively
> > > > > > > > > compressed.
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > 2017-06-06 12:00 GMT+03:00 Антон Чураев <
> > > > > > > > > > [hidden email]
> > > > > > > > > > > >:
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > Vyacheslav, thank you! But could you
> please
> > > > > > provide a
> > > > > > > > > > > > conclusions
> > > > > > > > > > > > > > or
> > > > > > > > > > > > > > > > > > proposals based on this benchmarks?
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > 2017-06-06 11:28 GMT+03:00 Vyacheslav
> > > Daradur <
> > > > > > > > > > > > > [hidden email]
> > > > > > > > > > > > > > >:
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > Dmitry,
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > Excel-pages:
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > 1). "Compression ratio (2)" - shows
> > object
> > > > > size,
> > > > > > > with
> > > > > > > > > > > > > compression
> > > > > > > > > > > > > > > and
> > > > > > > > > > > > > > > > > > > without compression. (Conditions:
> literal
> > > > text)
> > > > > > > > > > > > > > > > > > > 1st graph shows compression ratios of
> > using
> > > > > > > different
> > > > > > > > > > > > > compression
> > > > > > > > > > > > > > > > > > algrithms
> > > > > > > > > > > > > > > > > > > depending on size of compressed field.
> > > > > > > > > > > > > > > > > > > 2nd graph shows evaluation of size of
> > > objects
> > > > > > > > depending
> > > > > > > > > > on
> > > > > > > > > > > > > sizes
> > > > > > > > > > > > > > > and
> > > > > > > > > > > > > > > > > > > compression algorithms.
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > 2). "Compression ratio (1)" - shows
> > object
> > > > > size,
> > > > > > > with
> > > > > > > > > > > > > compression
> > > > > > > > > > > > > > > and
> > > > > > > > > > > > > > > > > > > without compression. (Conditions:
> badly
> > > > > > compressed
> > > > > > > > > > > character
> > > > > > > > > > > > > > > > sequence)
> > > > > > > > > > > > > > > > > > > 1st graph shows compression ratios of
> > using
> > > > > > > different
> > > > > > > > > > > > > compression
> > > > > > > > > > > > > > > > > > > algrithms depending on size of
> compressed
> > > > > field.
> > > > > > > > > > > > > > > > > > > 2nd graph shows evaluation of size of
> > > objects
> > > > > > > > depending
> > > > > > > > > > on
> > > > > > > > > > > > > sizes
> > > > > > > > > > > > > > > and
> > > > > > > > > > > > > > > > > > > compression algorithms.
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > 3) 'put-avg" - shows average time of
> the
> > > > "put"
> > > > > > > > > operation
> > > > > > > > > > > > > > depending
> > > > > > > > > > > > > > > on
> > > > > > > > > > > > > > > > > > size
> > > > > > > > > > > > > > > > > > > and compression algorithms.
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > 4) 'put-thrpt" - shows throughput of
> the
> > > > "put"
> > > > > > > > > operation
> > > > > > > > > > > > > > depending
> > > > > > > > > > > > > > > on
> > > > > > > > > > > > > > > > > > size
> > > > > > > > > > > > > > > > > > > and compression algorithms.
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > 5) 'get-avg" - shows average time of
> the
> > > > "get"
> > > > > > > > > operation
> > > > > > > > > > > > > > depending
> > > > > > > > > > > > > > > on
> > > > > > > > > > > > > > > > > > size
> > > > > > > > > > > > > > > > > > > and compression algorithms.
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > 6) 'get-thrpt" - shows throughput of
> the
> > > > "get"
> > > > > > > > > operation
> > > > > > > > > > > > > > depending
> > > > > > > > > > > > > > > on
> > > > > > > > > > > > > > > > > > size
> > > > > > > > > > > > > > > > > > > and compression algorithms.
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > 2017-06-06 10:59 GMT+03:00 Dmitriy
> > > Setrakyan
> > > > <
> > > > > > > > > > > > > > > [hidden email]
> > > > > > > > > > > > > > > > >:
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > Vladimir, I am not sure how to
> > interpret
> > > > the
> > > > > > > > graphs?
> > > > > > > > > > What
> > > > > > > > > > > > are
> > > > > > > > > > > > > > we
> > > > > > > > > > > > > > > > > > looking
> > > > > > > > > > > > > > > > > > > > at?
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > On Tue, Jun 6, 2017 at 12:33 AM,
> > > Vyacheslav
> > > > > > > > Daradur <
> > > > > > > > > > > > > > > > > > [hidden email]
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > Hi, Igniters.
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > I've prepared some benchmarking.
> > > Results
> > > > > [1].
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > And I've prepared the evaluation in
> > the
> > > > > form
> > > > > > of
> > > > > > > > > > > diagrams
> > > > > > > > > > > > > [2].
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > I hope that helps to interest the
> > > > community
> > > > > > and
> > > > > > > > > > > > > accelerates a
> > > > > > > > > > > > > > > > > > reaction
> > > > > > > > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > > > > > > this improvment :)
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > [1]
> > > > > > > > > > > > > > > > > > > > > https://github.com/daradurvs/
> > > > > > > > > > ignite-compression/tree/
> > > > > > > > > > > > > > > > > > > > > master/src/main/resources/result
> > > > > > > > > > > > > > > > > > > > > [2] https://drive.google.com/file/
> d/
> > > > > > > > > > > > > > > > 0B2CeUAOgrHkoMklyZ25YTEdKcEk/
> > > > > > > > > > > > > > > > > > view
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > 2017-05-24 9:49 GMT+03:00
> Vyacheslav
> > > > > Daradur
> > > > > > <
> > > > > > > > > > > > > > > > [hidden email]
> > > > > > > > > > > > > > > > > >:
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > Guys, any thoughts?
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > 2017-05-16 13:40 GMT+03:00
> > Vyacheslav
> > > > > > > Daradur <
> > > > > > > > > > > > > > > > > [hidden email]
> > > > > > > > > > > > > > > > > > >:
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > >> Hi guys,
> > > > > > > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > > > > > > >> I've prepared the PR to show my
> > > idea.
> > > > > > > > > > > > > > > > > > > > > >> https://github.com/apache/
> > > > > > > > > ignite/pull/1951/files
> > > > > > > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > > > > > > >> About querying - I've just
> copied
> > > > > existing
> > > > > > > > tests
> > > > > > > > > > and
> > > > > > > > > > > > > have
> > > > > > > > > > > > > > > > > > annotated
> > > > > > > > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > > > > >> testing data.
> > > > > > > > > > > > > > > > > > > > > >> https://github.com/apache/
> > > > > > > > > > > > ignite/pull/1951/files#diff-
> > > > > > > > > > > > > > > c19a9d
> > > > > > > > > > > > > > > > > > > > > >> f4058141d059bb577e75244764
> > > > > > > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > > > > > > >> It means fields which will be
> > marked
> > > > by
> > > > > > > > > > > > > @BinaryCompression
> > > > > > > > > > > > > > > > will
> > > > > > > > > > > > > > > > > be
> > > > > > > > > > > > > > > > > > > > > >> compressed at marshalling via
> > > > > > > > BinaryMarshaller.
> > > > > > > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > > > > > > >> This solution has no effect on
> > > > existing
> > > > > > data
> > > > > > > > or
> > > > > > > > > > > > project
> > > > > > > > > > > > > > > > > > > architecture.
> > > > > > > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > > > > > > >> I'll be glad to see your
> thougths.
> > > > > > > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > > > > > > >> 2017-05-15 19:18 GMT+03:00
> > > Vyacheslav
> > > > > > > Daradur
> > > > > > > > <
> > > > > > > > > > > > > > > > > > [hidden email]
> > > > > > > > > > > > > > > > > > > >:
> > > > > > > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > > > > > > >>> Dmitriy,
> > > > > > > > > > > > > > > > > > > > > >>>
> > > > > > > > > > > > > > > > > > > > > >>> I have ready prototype. I want
> to
> > > > show
> > > > > > it.
> > > > > > > > > > > > > > > > > > > > > >>> It is always easier to discuss
> on
> > > > > > example.
> > > > > > > > > > > > > > > > > > > > > >>>
> > > > > > > > > > > > > > > > > > > > > > >>> 2017-05-15 19:02 GMT+03:00 Dmitriy Setrakyan <[hidden email]>:
> > > > > > > > > > > > > > > > > > > > > > >>>
> > > > > > > > > > > > > > > > > > > > > > >>>> Vyacheslav,
> > > > > > > > > > > > > > > > > > > > > > >>>>
> > > > > > > > > > > > > > > > > > > > > > >>>> I think it is a bit premature to provide a PR without getting a
> > > > > > > > > > > > > > > > > > > > > > >>>> community consensus on the dev list. Please allow some time for the
> > > > > > > > > > > > > > > > > > > > > > >>>> community to respond.
> > > > > > > > > > > > > > > > > > > > > >>>>
> > > > > > > > > > > > > > > > > > > > > >>>> D.
> > > > > > > > > > > > > > > > > > > > > >>>>
> > > > > > > > > > > > > > > > > > > > > > >>>> On Mon, May 15, 2017 at 6:36 AM, Vyacheslav Daradur <[hidden email]>
> > > > > > > > > > > > > > > > > > > > > > >>>> wrote:
> > > > > > > > > > > > > > > > > > > > > > >>>>
> > > > > > > > > > > > > > > > > > > > > > >>>> > I created the ticket:
> > > > > > > > > > > > > > > > > > > > > > >>>> > https://issues.apache.org/jira/browse/IGNITE-5226
> > > > > > > > > > > > > > > > > > > > > > >>>> >
> > > > > > > > > > > > > > > > > > > > > > >>>> > I'll prepare a PR with the described solution in a couple of days.
> > > > > > > > > > > > > > > > > > > > > >>>> >
> > > > > > > > > > > > > > > > > > > > > > >>>> > 2017-05-15 15:05 GMT+03:00 Vyacheslav Daradur <[hidden email]>:
> > > > > > > > > > > > > > > > > > > > > > >>>> >
> > > > > > > > > > > > > > > > > > > > > > >>>> > > Hi, Igniters!
> > > > > > > > > > > > > > > > > > > > > > >>>> > >
> > > > > > > > > > > > > > > > > > > > > > >>>> > > Apache Ignite 2.0 is released.
> > > > > > > > > > > > > > > > > > > > > > >>>> > >
> > > > > > > > > > > > > > > > > > > > > > >>>> > > Let's continue the discussion about a compression design.
> > > > > > > > > > > > > > > > > > > > > > >>>> > >
> > > > > > > > > > > > > > > > > > > > > > >>>> > > At the moment, I have found only one solution that is compatible with
> > > > > > > > > > > > > > > > > > > > > > >>>> > > querying and indexing: per-object-field compression.
> > > > > > > > > > > > > > > > > > > > > > >>>> > > Per-field compression means that the metadata (the header) of an object
> > > > > > > > > > > > > > > > > > > > > > >>>> > > won't be compressed; only the serialized values of the object's fields
> > > > > > > > > > > > > > > > > > > > > > >>>> > > (in byte-array form) will be compressed.
> > > > > > > > > > > > > > > > > > > > > > >>>> > >
> > > > > > > > > > > > > > > > > > > > > > >>>> > > This solution has some contentious issues:
> > > > > > > > > > > > > > > > > > > > > > >>>> > > - small values, like primitives and short arrays: it makes no sense to
> > > > > > > > > > > > > > > > > > > > > > >>>> > > compress them;
> > > > > > > > > > > > > > > > > > > > > > >>>> > > - it is not possible to use compression with Java predefined types.
> > > > > > > > > > > > > > > > > > > > > > >>>> > >
> > > > > > > > > > > > > > > > > > > > > > >>>> > > We can provide an annotation, @IgniteCompression for example, which
> > > > > > > > > > > > > > > > > > > > > > >>>> > > users can apply to mark fields for compression.
> > > > > > > > > > > > > > > > > > > > > > >>>> > >
> > > > > > > > > > > > > > > > > > > > > > >>>> > > Any thoughts?
> > > > > > > > > > > > > > > > > > > > > > >>>> > >
> > > > > > > > > > > > > > > > > > > > > > >>>> > > Maybe someone already has a design ready?
> > > > > > > > > > > > > > > > > > > > > >>>> > >
> > > > > > > > > > > > > > > > > > > > > > >>>> > > 2017-04-10 11:06 GMT+03:00 Vyacheslav Daradur <[hidden email]>:
> > > > > > > > > > > > > > > > > > > > > > >>>> > >
> > > > > > > > > > > > > > > > > > > > > > >>>> > >> Alexey,
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>
> > > > > > > > > > > > > > > > > > > > > > >>>> > >> Yes, I've read it.
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>
> > > > > > > > > > > > > > > > > > > > > > >>>> > >> OK, let's discuss the public API design.
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>
> > > > > > > > > > > > > > > > > > > > > > >>>> > >> I think we need to add a configuration entity to CacheConfiguration
> > > > > > > > > > > > > > > > > > > > > > >>>> > >> that will contain the Compressor interface implementation and some
> > > > > > > > > > > > > > > > > > > > > > >>>> > >> useful parameters.
> > > > > > > > > > > > > > > > > > > > > > >>>> > >> Or maybe provide a BinaryMarshaller decorator that will compress
> > > > > > > > > > > > > > > > > > > > > > >>>> > >> data after marshalling.
> > > > > > > > > > > > > > > > > > > > > >>>> > >>
> > > > > > > > > > > > > > > > > > > > > >>>> > >>
> > > > > > > > > > > > > > > > > > > > > > >>>> > >> 2017-04-10 10:40 GMT+03:00 Alexey Kuznetsov <[hidden email]>:
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> Vyacheslav,
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>>
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> Did you read the initial discussion [1] about compression?
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> As far as I remember, we agreed to add only some "top-level" API in
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> order to provide a way for Ignite users to inject some sort of custom
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> compression.
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>>
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>>
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> [1]
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> http://apache-ignite-developers.2346864.n4.nabble.com/Data-compression-in-Ignite-2-0-td10099.html
> > > > > > > > > > > > > > > > > > > > > >>>> > >>>
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> On Mon, Apr 10, 2017 at 2:19 PM, daradurvs <[hidden email]> wrote:
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>>
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > Hi Igniters!
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> >
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > I am interested in this task:
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > Provide some kind of pluggable compression SPI support
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > <https://issues.apache.org/jira/browse/IGNITE-3592>
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> >
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > I developed a solution at the BinaryMarshaller level, but the reviewer
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > rejected it.
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> >
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > Let's continue the discussion of the task's goals and solution design.
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > As I understand it, the main goal of this task is to store data in
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > compressed form.
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > This is what I need from Ignite as its user. Compression saves
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > server resources:
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > we can store more data on the same servers at the cost of increased
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > CPU utilization.
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> >
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > I'm researching the possibility of implementing compression at the
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > cache level.
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> >
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > Any thoughts?
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> >
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > --
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > Best regards,
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > Vyacheslav
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> >
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> >
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> >
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> >
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > --
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > View this message in context:
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > http://apache-ignite-developers.2346864.n4.nabble.com/Data-compression-in-Ignite-2-0-tp10099p16317.html
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > Sent from the Apache Ignite Developers mailing list archive at
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > Nabble.com.
> > > > > > > > > > > > > > > > > > > > > >>>> > >>> >



--
Best Regards, Vyacheslav
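
For readers following the thread, the per-field scheme quoted above (object header left uncompressed, each serialized field value compressed on its own and tagged with a type marker) can be sketched roughly as follows. This is not the code from the pull request: the class name, the marker values, and the size threshold are all hypothetical, and java.util.zip's Deflater stands in for the ZSTD codec proposed in the thread.

```java
import java.io.ByteArrayOutputStream;
import java.util.Arrays;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

/** Illustrative only, not Ignite code. All names and constants here are hypothetical. */
final class FieldCompressionSketch {
    /** Hypothetical type markers written in front of each stored field value. */
    static final byte RAW = 0;
    static final byte COMPRESSED = 1;

    /** Values shorter than this are stored as-is: compressing them rarely pays off. */
    static final int MIN_SIZE = 64;

    /** Encodes one serialized field value: marker byte, then raw or deflated bytes. */
    static byte[] writeField(byte[] value) {
        if (value.length < MIN_SIZE)
            return frame(RAW, value);

        Deflater deflater = new Deflater();
        deflater.setInput(value);
        deflater.finish();

        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[256];
        while (!deflater.finished())
            out.write(buf, 0, deflater.deflate(buf));
        deflater.end();

        byte[] packed = out.toByteArray();
        // Fall back to the raw form when compression does not actually help.
        return packed.length < value.length ? frame(COMPRESSED, packed) : frame(RAW, value);
    }

    /** Decodes a field value, inflating it only when the marker says it is compressed. */
    static byte[] readField(byte[] stored) throws DataFormatException {
        byte[] body = Arrays.copyOfRange(stored, 1, stored.length);
        if (stored[0] == RAW)
            return body;

        Inflater inflater = new Inflater();
        inflater.setInput(body);

        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[256];
        while (!inflater.finished()) {
            int n = inflater.inflate(buf);
            // No progress and not finished means the input is incomplete or needs a dictionary.
            if (n == 0 && (inflater.needsInput() || inflater.needsDictionary()))
                throw new DataFormatException("truncated or unsupported input");
            out.write(buf, 0, n);
        }
        inflater.end();
        return out.toByteArray();
    }

    /** Prepends the marker byte to the payload. */
    private static byte[] frame(byte marker, byte[] payload) {
        byte[] res = new byte[payload.length + 1];
        res[0] = marker;
        System.arraycopy(payload, 0, res, 1, payload.length);
        return res;
    }
}
```

Because only the field value is framed, a reader that wants a single field can decode just that value, which is the property the thread relies on for querying and indexing. The threshold and the raw fallback reflect the concern raised above that compressing small values makes no sense.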

Re: Data compression in Ignite 2.0

Антон Чураев
In reply to this post by Vladimir Ozerov
It seems that Dmitriy is referring to transparent data encryption. It is used
throughout the whole database industry.

2017-06-09 10:50 GMT+03:00 Vladimir Ozerov <[hidden email]>:

> Dima,
>
> Encryption of certain fields is as bad as compression. First, it is a huge
> change, which makes the already complex binary protocol even more complex.
> Second, it has to be ported to the CPP and .NET platforms, as well as to JDBC
> and ODBC.
> Last, but most important: it is not our headache to encrypt
> sensitive data. This is the user's responsibility. Nobody in their right mind
> will store passwords in plain form. Instead, the user should encrypt them on
> their own, choosing proper encryption parameters: algorithms, key lengths,
> salts, etc. How are you going to expose this in the API or configuration?
>
> We should not implement data encryption at the binary level; this is out of
> the question. Encryption should be implemented at the application level (user
> effort), the transport layer (SSL, which we already have), and possibly at
> the disk level (there are tools for this already).
>
>
> On Fri, Jun 9, 2017 at 9:06 AM, Vyacheslav Daradur <[hidden email]>
> wrote:
>
> > >> which is much less useful.
> > I note that in some cases the size of an object is reduced by more than
> > half.
> >
> > >> Would it be possible to change your implementation to handle the
> > encryption instead?
> > Yes, of course; there's not much difference between compression and
> > encryption, including in my implementation of per-field compression.
> >
> > 2017-06-09 8:55 GMT+03:00 Dmitriy Setrakyan <[hidden email]>:
> >
> > > Vyacheslav,
> > >
> > > When this feature started out as data compression in Ignite, it sounded
> > > very useful. Now it is unfolding as per-field compression, which is much
> > > less useful. In fact, it is questionable whether it is useful at all. The
> > > fact that this feature is implemented does not make it mandatory for the
> > > community to accept it.
> > >
> > > However, as I mentioned before, per-field encryption is very useful, as it
> > > would allow users to automatically encrypt certain sensitive fields, like
> > > passwords, credit card numbers, etc. There is not much conceptual
> > > difference between compressing a field vs encrypting a field. Would it be
> > > possible to change your implementation to handle the encryption instead?
> > >
> > > D.
> > >
> > > On Thu, Jun 8, 2017 at 10:42 PM, Vyacheslav Daradur <[hidden email]>
> > > wrote:
> > >
> > > > Guys, I want to be clear:
> > > > * The "per-field compression" design is the result of researching Ignite's
> > > > binary infrastructure and some of its other parts (querying, indexing,
> > > > etc.)
> > > > * Full compression of an object would be more effective, but in that case
> > > > there is no compatibility with querying and indexing (or there is a large
> > > > overhead from decompressing the full object (or cache pages) on demand)
> > > > * "Per-field compression" is one of the ways to implement the compression
> > > > feature
> > > >
> > > > I'm new to Ignite, so I may be mistaken about some things.
> > > > For the last 3-4 months I've tried to start a discussion about the design,
> > > > but nobody answered (except Dmitriy and Valentin, who were interested in
> > > > how it works).
> > > > But I understand that this is a community and nobody is obliged to anybody.
> > > >
> > > > There are strong Ignite experts here.
> > > > If they can help me and the community with the design of the compression
> > > > feature, it will be great.
> > > > At the moment I have the desire and the time to work on developing the
> > > > compression feature in Ignite.
> > > > Let's use this opportunity :)
> > > >
> > > > 2017-06-09 5:36 GMT+03:00 Dmitriy Setrakyan <[hidden email]>:
> > > >
> > > > > Igniters,
> > > > >
> > > > > I have never seen a single Ignite user asking about compressing a single
> > > > > field. However, we have had requests to secure certain fields, e.g.
> > > > > passwords.
> > > > >
> > > > > I personally do not think per-field compression is needed, unless we can
> > > > > point out some concrete real-life use cases.
> > > > >
> > > > > D.
> > > > >
> > > > > On Thu, Jun 8, 2017 at 3:42 AM, Vyacheslav Daradur <[hidden email]>
> > > > > wrote:
> > > > >
> > > > > > Anton,
> > > > > >
> > > > > > >> I thought that if compressed data is stored in memory, it will
> > > > > > >> also be transmitted over the wire in compressed form. Is that right?
> > > > > >
> > > > > > In the per-field compression case, yes.
> > > > > >
> > > > > > 2017-06-08 13:36 GMT+03:00 Антон Чураев <[hidden email]>:
> > > > > >
> > > > > > > Guys, could you please help me?
> > > > > > > I thought that if compressed data is stored in memory, it will
> > > > > > > also be transmitted over the wire in compressed form. Is that right?
> > > > > > >
> > > > > > > 2017-06-08 13:30 GMT+03:00 Vyacheslav Daradur <[hidden email]>:
> > > > > > >
> > > > > > > > Vladimir,
> > > > > > > >
> > > > > > > > The main problem I am trying to solve is storing data in memory in
> > > > > > > > compressed form via Ignite.
> > > > > > > > The main goal is to use memory more effectively.
> > > > > > > >
> > > > > > > > >> here the much simpler step would be to do full
> > > > > > > > compression on a per-cache basis rather than dealing with the
> > > > > > > > per-field case.
> > > > > > > >
> > > > > > > > Please explain your idea. Compress data by memory page?
> > > > > > > > Is it compatible with querying and indexing?
> > > > > > > >
> > > > > > > > >> In the end, if a user would like to compress a particular field, he
> > > > > > > > can always do it on his own
> > > > > > > > I don't think we should reason this way: if a user needs something,
> > > > > > > > he tries to choose a tool that has this feature OOTB.
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > 2017-06-08 12:53 GMT+03:00 Vladimir Ozerov <[hidden email]>:
> > > > > > > >
> > > > > > > > > Igniters,
> > > > > > > > >
> > > > > > > > > Honestly, I still do not see how to apply this feature to Ignite
> > > > > > > > > gracefully. And the overall approach of compressing only particular
> > > > > > > > > fields looks overcomplicated to me. Remember that our main use case is
> > > > > > > > > an application without classes on the server. It means that any kind
> > > > > > > > > of annotations are inapplicable. To be more precise: a proper API
> > > > > > > > > should be implemented to handle the no-class case (e.g. how would one
> > > > > > > > > build such an object through BinaryBuilder without a class?), and only
> > > > > > > > > then should annotations be added as a convenient addition to the more
> > > > > > > > > basic API.
> > > > > > > > >
> > > > > > > > > It seems to me that a full implementation, which takes into account a
> > > > > > > > > proper "classless" API, changes to binary metadata to reflect
> > > > > > > > > compressed fields, changes to SQL, changes to the binary protocol, and
> > > > > > > > > porting to .NET and CPP, will yield a very complex solution with little
> > > > > > > > > value to the product.
> > > > > > > > >
> > > > > > > > > Instead, as I proposed earlier, it seems that we'd better start with
> > > > > > > > > the problem we are trying to solve. Basically, compression could help
> > > > > > > > > in two cases:
> > > > > > > > > 1) Transmitting data over the wire: it should be implemented at the
> > > > > > > > > communication layer and should not affect the binary serialization
> > > > > > > > > component a lot.
> > > > > > > > > 2) Storing data in memory: here the much simpler step would be to do
> > > > > > > > > full compression on a per-cache basis rather than dealing with the
> > > > > > > > > per-field case.
> > > > > > > > >
> > > > > > > > > In the end, if a user would like to compress a particular field, he can
> > > > > > > > > always do it on his own, and set the already compressed field to our
> > > > > > > > > BinaryObject.
> > > > > > > > >
> > > > > > > > > Vladimir.
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > On Thu, Jun 8, 2017 at 12:37 PM, Vyacheslav Daradur <[hidden email]>
> > > > > > > > > wrote:
> > > > > > > > >
> > > > > > > > > > Valentin,
> > > > > > > > > >
> > > > > > > > > > Yes, I have a prototype [1][2].
> > > > > > > > > >
> > > > > > > > > > You can see an example of a Java class [3] that I used in my benchmark.
> > > > > > > > > > For example:
> > > > > > > > > > class Foo {
> > > > > > > > > >     @BinaryCompression
> > > > > > > > > >     String data;
> > > > > > > > > > }
> > > > > > > > > > If a user decides to store the object in compressed form, he can use
> > > > > > > > > > the @BinaryCompression annotation as shown above.
> > > > > > > > > > It means the annotated field 'data' will be compressed at marshalling.
> > > > > > > > > >
> > > > > > > > > > [1] https://github.com/apache/ignite/pull/1951
> > > > > > > > > > [2] https://issues.apache.org/jira/browse/IGNITE-5226
> > > > > > > > > > [3]
> > > > > > > > > > https://github.com/daradurvs/ignite-compression/blob/master/src/main/java/ru/daradurvs/ignite/compression/model/Audit1F.java
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > 2017-06-08 2:04 GMT+03:00 Valentin Kulichenko <[hidden email]>:
> > > > > > > > > >
> > > > > > > > > > > Vyacheslav, Anton,
> > > > > > > > > > >
> > > > > > > > > > > Are there any ideas and/or prototypes for the API? Your design
> > > > > > > > > > > suggestions seem to make sense, but I would like to see how all this
> > > > > > > > > > > will look from the user's standpoint.
> > > > > > > > > > >
> > > > > > > > > > > -Val
> > > > > > > > > > >
> > > > > > > > > > > On Wed, Jun 7, 2017 at 1:06 AM, Антон Чураев <[hidden email]> wrote:
> > > > > > > > > > >
> > > > > > > > > > > > Vyacheslav, correct me if something is wrong.
> > > > > > > > > > > >
> > > > > > > > > > > > We could give users the opportunity to choose between CPU usage and
> > > > > > > > > > > > MEM/NET usage by compressing some attributes of stored objects.
> > > > > > > > > > > > You have studied the design, and it is possible to localize the
> > > > > > > > > > > > changes in marshalling without affecting performance or current
> > > > > > > > > > > > functionality.
> > > > > > > > > > > >
> > > > > > > > > > > > I think that it's useful for our project and users.
> > > > > > > > > > > > Community, what do you think about this proposal?
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > 2017-06-06 17:29 GMT+03:00 Vyacheslav Daradur <
> > > > > > > [hidden email]
> > > > > > > > >:
> > > > > > > > > > > >
> > > > > > > > > > > > > In short,
> > > > > > > > > > > > >
> > > > > > > > > > > > > During marshalling a fields is represented as
> > > > > > > BinaryFieldAccessor
> > > > > > > > > > which
> > > > > > > > > > > > > manages its marshalling. It checks if the field is
> > > marked
> > > > > by
> > > > > > > > > > annotation
> > > > > > > > > > > > > @BinaryCompression, in that case - binary
> > > representation
> > > > > of
> > > > > > > > field
> > > > > > > > > > > (bytes
> > > > > > > > > > > > > array) will be compressed. It will be marked as
> > > > compressed
> > > > > by
> > > > > > > > types
> > > > > > > > > > > > > constant (GridBinaryMarshaller.COMPRESSED), after
> > this
> > > > the
> > > > > > > > > > compressed
> > > > > > > > > > > > > bytes
> > > > > > > > > > > > > array wiil be include in binary representation of
> > whole
> > > > > > object.
> > > > > > > > > Note,
> > > > > > > > > > > > > header of marshalled object will not be compressed.
> > > > > > Compression
> > > > > > > > > > > affected
> > > > > > > > > > > > > only object's field representation.
> > > > > > > > > > > > >
> > > > > > > > > > > > > > Objects in IgniteCache are represented as BinaryObject, which is
> > > > > > > > > > > > > > a wrapper over the byte array of the marshalled object.
> > > > > > > > > > > > > > BinaryObject provides some useful methods that are used by
> > > > > > > > > > > > > > Ignite subsystems.
> > > > > > > > > > > > > > For example, queries use the BinaryObject#field method, which
> > > > > > > > > > > > > > deserializes a single field of an object without deserializing
> > > > > > > > > > > > > > the whole object.
> > > > > > > > > > > > > > If BinaryObject#field meets the compressed type constant during
> > > > > > > > > > > > > > deserialization, it decompresses the byte array and then
> > > > > > > > > > > > > > continues unmarshalling as usual.
> > > > > > > > > > > > >
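The write and read paths described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the class name `FieldCompressionSketch`, the marker values, and the `[marker][length][payload]` layout are hypothetical and are not taken from the PR, and `java.util.zip` Deflate stands in for whichever algorithm is finally chosen.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

/** Per-field compression sketch: [type marker][original length][payload]. */
final class FieldCompressionSketch {
    /** Hypothetical markers; the thread names GridBinaryMarshaller.COMPRESSED but not its value. */
    static final byte PLAIN = 0;
    static final byte COMPRESSED = 1;

    /** Writes one serialized field; only fields flagged for compression are deflated. */
    static byte[] writeField(byte[] fieldBytes, boolean compress) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write(compress ? COMPRESSED : PLAIN);
        // Original length as 4 big-endian bytes, so the reader can size its buffer.
        out.write(ByteBuffer.allocate(4).putInt(fieldBytes.length).array(), 0, 4);
        if (!compress) {
            out.write(fieldBytes, 0, fieldBytes.length);
            return out.toByteArray();
        }
        Deflater deflater = new Deflater();
        deflater.setInput(fieldBytes);
        deflater.finish();
        byte[] buf = new byte[512];
        while (!deflater.finished()) {
            int n = deflater.deflate(buf);
            out.write(buf, 0, n);
        }
        deflater.end();
        return out.toByteArray();
    }

    /** Reads one field, inflating it first if the marker says it was compressed. */
    static byte[] readField(byte[] stored) {
        int len = ByteBuffer.wrap(stored, 1, 4).getInt();
        byte[] result = new byte[len];
        if (stored[0] == PLAIN) {
            System.arraycopy(stored, 5, result, 0, len);
            return result;
        }
        Inflater inflater = new Inflater();
        inflater.setInput(stored, 5, stored.length - 5);
        try {
            int off = 0;
            while (off < len) {
                int n = inflater.inflate(result, off, len - off);
                if (n == 0)
                    throw new IllegalStateException("Truncated or corrupt compressed field");
                off += n;
            }
        } catch (DataFormatException e) {
            throw new IllegalStateException("Corrupt compressed field", e);
        } finally {
            inflater.end();
        }
        return result;
    }
}
```

In this sketch the caller decides per field whether to compress, which plays the role of the annotation check; the object header is never passed through `writeField`, matching the note that only field representations are compressed.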
> > > > > > > > > > > > > > I have now introduced the Compressor interface in
> > > > > > > > > > > > > > IgniteConfiguration; it allows users to plug in their own
> > > > > > > > > > > > > > compressor implementation, which is a requirement of the
> > > > > > > > > > > > > > task [1].
> > > > > > > > > > > > >
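A pluggable compressor of this kind might look like the sketch below. The method names `compress`/`decompress`, the `DeflateCompressor` default, and the `CompressionConfigSketch` holder are assumptions made for illustration; the thread does not show the actual interface or how it is wired into the configuration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.util.Objects;
import java.util.zip.DeflaterOutputStream;
import java.util.zip.InflaterInputStream;

/** Hypothetical shape of the user-pluggable interface. */
interface Compressor {
    byte[] compress(byte[] bytes);

    byte[] decompress(byte[] bytes);
}

/** Default implementation based on JDK Deflate (a stand-in, not the algorithm voted for in the thread). */
final class DeflateCompressor implements Compressor {
    @Override public byte[] compress(byte[] bytes) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        // Closing the stream flushes the final Deflate block into 'out'.
        try (OutputStream deflate = new DeflaterOutputStream(out)) {
            deflate.write(bytes);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    @Override public byte[] decompress(byte[] bytes) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (InputStream inflate = new InflaterInputStream(new ByteArrayInputStream(bytes))) {
            byte[] buf = new byte[4096];
            int n;
            while ((n = inflate.read(buf)) != -1)
                out.write(buf, 0, n);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }
}

/** Hypothetical configuration holder; stands in for the real configuration class. */
final class CompressionConfigSketch {
    private Compressor compressor = new DeflateCompressor();

    CompressionConfigSketch setCompressor(Compressor compressor) {
        this.compressor = Objects.requireNonNull(compressor, "compressor");
        return this;
    }

    Compressor getCompressor() {
        return compressor;
    }
}
```

If the interface is later moved into the internals, only the setter would disappear; the marshalling code could keep calling the same two methods.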
> > > > > > > > > > > > > > As far as I know, Vladimir Ozerov doesn't like the idea of
> > > > > > > > > > > > > > giving users this option.
> > > > > > > > > > > > > > In that case we can choose a compression algorithm to provide
> > > > > > > > > > > > > > by default and move the interface into the internals of the
> > > > > > > > > > > > > > binary infrastructure.
> > > > > > > > > > > > > > For this case I've prepared the benchmarks that I sent earlier.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > I vote for the ZSTD algorithm [2]: it provides a good
> > > > > > > > > > > > > > compression ratio and good throughput. It has implementations
> > > > > > > > > > > > > > in Java, .NET and C++ and an ASF-friendly license, so we can
> > > > > > > > > > > > > > use it on all Ignite platforms.
> > > > > > > > > > > > > > You can look at an assessment of this algorithm in my
> > > > > > > > > > > > > > benchmarks.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > [1] https://issues.apache.org/jira/browse/IGNITE-3592
> > > > > > > > > > > > > > [2] https://github.com/facebook/zstd
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > > 2017-06-06 16:02 GMT+03:00 Антон Чураев <[hidden email]>:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Looks good to me.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Could you outline the implementation design in a couple of
> > > > > > > > > > > > > > > sentences, so that we can estimate the completeness and
> > > > > > > > > > > > > > > complexity of the proposal?
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > 2017-06-06 15:26 GMT+03:00 Vyacheslav Daradur <[hidden email]>:
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Anton,
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Of course, the solution does not affect the existing
> > > > > > > > > > > > > > > > implementation. I mean, nothing changes (including
> > > > > > > > > > > > > > > > performance) if the user does not use the
> > > > > > > > > > > > > > > > @BinaryCompression annotation.
> > > > > > > > > > > > > > > > Only if the user decides to compress a specific field or
> > > > > > > > > > > > > > > > fields of a class will compression be applied during
> > > > > > > > > > > > > > > > marshalling, and only to the annotated fields.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > 2017-06-06 15:10 GMT+03:00 Антон Чураев <[hidden email]>:
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Vyacheslav,
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Is it possible to propose an implementation that can be
> > > > > > > > > > > > > > > > > switched on on demand?
> > > > > > > > > > > > > > > > > In that case it should not affect the performance of the
> > > > > > > > > > > > > > > > > current solution.
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > I mean that users should decide what is more important
> > > > > > > > > > > > > > > > > for them: throughput or memory/network usage.
> > > > > > > > > > > > > > > > > Maybe they will choose to compress not all objects, or
> > > > > > > > > > > > > > > > > only some attributes of objects.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > 2017-06-06 14:48 GMT+03:00 Vyacheslav Daradur <[hidden email]>:
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > Conclusion:
> > > > > > > > > > > > > > > > > > The provided solution reduces the size of an object in
> > > > > > > > > > > > > > > > > > IgniteCache at the cost of reduced throughput (small in
> > > > > > > > > > > > > > > > > > some cases); the effect depends on which part of the
> > > > > > > > > > > > > > > > > > object is compressed and on the compression algorithm.
> > > > > > > > > > > > > > > > > > In other words, we can use memory more effectively, and
> > > > > > > > > > > > > > > > > > in some cases reduce the load on the interconnect
> > > > > > > > > > > > > > > > > > (replication, rebalancing).
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > It will be particularly useful for object fields that
> > > > > > > > > > > > > > > > > > hold large text (>~ 250 bytes) and can be compressed
> > > > > > > > > > > > > > > > > > effectively.
> > > > > > > > > > > > > > > > >
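The size remark above can be illustrated with a small check that skips short fields and fields that do not shrink. This is a sketch only: the 250-byte threshold is simply the figure quoted in the message, Deflate from `java.util.zip` is used as a stand-in algorithm, and the names `CompressionThresholdSketch`, `compressedSize` and `worthCompressing` are hypothetical.

```java
import java.util.zip.Deflater;

/** Decides whether compressing a serialized field is likely to pay off. */
final class CompressionThresholdSketch {
    /** Taken from the ">~ 250 bytes" remark; an assumption, not a measured optimum. */
    static final int MIN_FIELD_SIZE = 250;

    /** Returns the number of bytes Deflate produces for the given input. */
    static int compressedSize(byte[] data) {
        Deflater deflater = new Deflater();
        deflater.setInput(data);
        deflater.finish();
        byte[] buf = new byte[512];
        int total = 0;
        while (!deflater.finished())
            total += deflater.deflate(buf);
        deflater.end();
        return total;
    }

    /** A field is worth compressing only if it is large enough and actually shrinks. */
    static boolean worthCompressing(byte[] field) {
        return field.length >= MIN_FIELD_SIZE && compressedSize(field) < field.length;
    }
}
```

Very short inputs typically grow under Deflate because of the stream header and checksum, which is one reason a lower bound on field size is useful.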
> > > > > > > > > > > > > > > > > > 2017-06-06 12:00 GMT+03:00 Антон Чураев <[hidden email]>:
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > Vyacheslav, thank you! But could you please provide
> > > > > > > > > > > > > > > > > > > conclusions or proposals based on these benchmarks?
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > 2017-06-06 11:28 GMT+03:00 Vyacheslav Daradur <[hidden email]>:
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > Dmitry,
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > Excel pages:
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > 1) "Compression ratio (2)" shows object size with and
> > > > > > > > > > > > > > > > > > > > without compression. (Conditions: literal text)
> > > > > > > > > > > > > > > > > > > > The 1st graph shows the compression ratios of the
> > > > > > > > > > > > > > > > > > > > different compression algorithms depending on the size
> > > > > > > > > > > > > > > > > > > > of the compressed field.
> > > > > > > > > > > > > > > > > > > > The 2nd graph shows object sizes depending on field
> > > > > > > > > > > > > > > > > > > > sizes and compression algorithms.
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > 2) "Compression ratio (1)" shows object size with and
> > > > > > > > > > > > > > > > > > > > without compression. (Conditions: poorly compressible
> > > > > > > > > > > > > > > > > > > > character sequence)
> > > > > > > > > > > > > > > > > > > > The 1st graph shows the compression ratios of the
> > > > > > > > > > > > > > > > > > > > different compression algorithms depending on the size
> > > > > > > > > > > > > > > > > > > > of the compressed field.
> > > > > > > > > > > > > > > > > > > > The 2nd graph shows object sizes depending on field
> > > > > > > > > > > > > > > > > > > > sizes and compression algorithms.
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > 3) "put-avg" shows the average time of the "put"
> > > > > > > > > > > > > > > > > > > > operation depending on size and compression algorithm.
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > 4) "put-thrpt" shows the throughput of the "put"
> > > > > > > > > > > > > > > > > > > > operation depending on size and compression algorithm.
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > 5) "get-avg" shows the average time of the "get"
> > > > > > > > > > > > > > > > > > > > operation depending on size and compression algorithm.
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > 6) "get-thrpt" shows the throughput of the "get"
> > > > > > > > > > > > > > > > > > > > operation depending on size and compression algorithm.
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > 2017-06-06 10:59 GMT+03:00 Dmitriy Setrakyan <[hidden email]>:
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > Vladimir, I am not sure how to interpret the graphs.
> > > > > > > > > > > > > > > > > > > > > What are we looking at?
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > On Tue, Jun 6, 2017 at 12:33 AM, Vyacheslav Daradur <[hidden email]> wrote:
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > Hi, Igniters.
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > I've prepared some benchmarks. Results [1].
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > And I've prepared an evaluation in the form of
> > > > > > > > > > > > > > > > > > > > > > diagrams [2].
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > I hope this helps to interest the community and speeds
> > > > > > > > > > > > > > > > > > > > > > up the reaction to this improvement :)
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > [1] https://github.com/daradurvs/ignite-compression/tree/master/src/main/resources/result
> > > > > > > > > > > > > > > > > > > > > > [2] https://drive.google.com/file/d/0B2CeUAOgrHkoMklyZ25YTEdKcEk/view
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > 2017-05-24 9:49 GMT+03:00 Vyacheslav Daradur <[hidden email]>:
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > Guys, any thoughts?
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > 2017-05-16 13:40 GMT+03:00 Vyacheslav Daradur <[hidden email]>:
> > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > >> Hi guys,
> > > > > > > > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > > > > > > > >> I've prepared a PR to show my idea:
> > > > > > > > > > > > > > > > > > > > > > >> https://github.com/apache/ignite/pull/1951/files
> > > > > > > > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > > > > > > > >> About querying: I've just copied the existing tests and
> > > > > > > > > > > > > > > > > > > > > > >> annotated the test data.
> > > > > > > > > > > > > > > > > > > > > > >> https://github.com/apache/ignite/pull/1951/files#diff-c19a9df4058141d059bb577e75244764
> > > > > > > > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > > > > > > > >> This means that fields marked with @BinaryCompression
> > > > > > > > > > > > > > > > > > > > > > >> will be compressed during marshalling by
> > > > > > > > > > > > > > > > > > > > > > >> BinaryMarshaller.
> > > > > > > > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > > > > > > > >> This solution has no effect on existing data or on the
> > > > > > > > > > > > > > > > > > > > > > >> project architecture.
> > > > > > > > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > > > > > > > >> I'll be glad to hear your thoughts.
> > > > > > > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > > > > > > > >> 2017-05-15 19:18 GMT+03:00 Vyacheslav Daradur <[hidden email]>:
> > > > > > > > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > > > > > > > >>> Dmitriy,
> > > > > > > > > > > > > > > > > > > > > > >>>
> > > > > > > > > > > > > > > > > > > > > > >>> I have a prototype ready and want to show it.
> > > > > > > > > > > > > > > > > > > > > > >>> It is always easier to discuss a concrete example.
> > > > > > > > > > > > > > > > > > > > > >>>
> > > > > > > > > > > > > > > > > > > > > > >>> 2017-05-15 19:02 GMT+03:00 Dmitriy Setrakyan <[hidden email]>:
> > > > > > > > > > > > > > > > > > > > > > >>>
> > > > > > > > > > > > > > > > > > > > > > >>>> Vyacheslav,
> > > > > > > > > > > > > > > > > > > > > > >>>>
> > > > > > > > > > > > > > > > > > > > > > >>>> I think it is a bit premature to provide a PR without
> > > > > > > > > > > > > > > > > > > > > > >>>> getting a community consensus on the dev list. Please
> > > > > > > > > > > > > > > > > > > > > > >>>> allow some time for the community to respond.
> > > > > > > > > > > > > > > > > > > > > > >>>>
> > > > > > > > > > > > > > > > > > > > > > >>>> D.
> > > > > > > > > > > > > > > > > > > > > >>>>
> > > > > > > > > > > > > > > > > > > > > > >>>> On Mon, May 15, 2017 at 6:36 AM, Vyacheslav Daradur <[hidden email]> wrote:
> > > > > > > > > > > > > > > > > > > > > > >>>>
> > > > > > > > > > > > > > > > > > > > > > >>>> > I created the ticket:
> > > > > > > > > > > > > > > > > > > > > > >>>> > https://issues.apache.org/jira/browse/IGNITE-5226
> > > > > > > > > > > > > > > > > > > > > > >>>> >
> > > > > > > > > > > > > > > > > > > > > > >>>> > I'll prepare a PR with the described solution in a
> > > > > > > > > > > > > > > > > > > > > > >>>> > couple of days.
> > > > > > > > > > > > > > > > > > > > > >>>> >
> > > > > > > > > > > > > > > > > > > > > > >>>> > 2017-05-15 15:05 GMT+03:00 Vyacheslav Daradur <[hidden email]>:
> > > > > > > > > > > > > > > > > > > > > > >>>> >
> > > > > > > > > > > > > > > > > > > > > > >>>> > > Hi, Igniters!
> > > > > > > > > > > > > > > > > > > > > > >>>> > >
> > > > > > > > > > > > > > > > > > > > > > >>>> > > Apache Ignite 2.0 is released.
> > > > > > > > > > > > > > > > > > > > > > >>>> > >
> > > > > > > > > > > > > > > > > > > > > > >>>> > > Let's continue the discussion about a compression
> > > > > > > > > > > > > > > > > > > > > > >>>> > > design.
> > > > > > > > > > > > > > > > > > > > > > >>>> > >
> > > > > > > > > > > > > > > > > > > > > > >>>> > > At the moment, I have found only one solution that is
> > > > > > > > > > > > > > > > > > > > > > >>>> > > compatible with querying and indexing: per-object-field
> > > > > > > > > > > > > > > > > > > > > > >>>> > > compression.
> > > > > > > > > > > > > > > > > > > > > > >>>> > > Per-field compression means that the metadata (header)
> > > > > > > > > > > > > > > > > > > > > > >>>> > > of an object won't be compressed; only the serialized
> > > > > > > > > > > > > > > > > > > > > > >>>> > > values of the object's fields (in byte-array form)
> > > > > > > > > > > > > > > > > > > > > > >>>> > > will be compressed.
> > > > > > > > > > > > > > > > > > > > > > >>>> > >
> > > > > > > > > > > > > > > > > > > > > > >>>> > > This solution has some contentious issues:
> > > > > > > > > > > > > > > > > > > > > > >>>> > > - small values, like primitives and short arrays:
> > > > > > > > > > > > > > > > > > > > > > >>>> > > there is no sense in compressing them;
> > > > > > > > > > > > > > > > > > > > > > >>>> > > - it is not possible to use compression with Java
> > > > > > > > > > > > > > > > > > > > > > >>>> > > predefined types;
> > > > > > > > > > > > > > > > > > > > > > >>>> > >
> > > > > > > > > > > > > > > > > > > > > > >>>> > > We can provide an annotation, @IgniteCompression for
> > > > > > > > > > > > > > > > > > > > > > >>>> > > example, which users can apply to mark fields to
> > > > > > > > > > > > > > > > > > > > > > >>>> > > compress.
> > > > > > > > > > > > > > > > > > > > > > >>>> > >
> > > > > > > > > > > > > > > > > > > > > > >>>> > > Any thoughts?
> > > > > > > > > > > > > > > > > > > > > > >>>> > >
> > > > > > > > > > > > > > > > > > > > > > >>>> > > Maybe someone already has a ready design?
> > > > > > > > > > > > > > > > > > > > > >>>> > >
> > > > > > > > > > > > > > > > > > > > > > >>>> > > 2017-04-10 11:06 GMT+03:00 Vyacheslav Daradur <[hidden email]>:
> > > > > > > > > > > > > > > > > > > > > > >>>> > >
> > > > > > > > > > > > > > > > > > > > > > >>>> > >> Alexey,
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>
> > > > > > > > > > > > > > > > > > > > > > >>>> > >> Yes, I've read it.
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>
> > > > > > > > > > > > > > > > > > > > > > >>>> > >> OK, let's discuss the public API design.
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>
> > > > > > > > > > > > > > > > > > > > > > >>>> > >> I think we need to add a configuration entity to
> > > > > > > > > > > > > > > > > > > > > > >>>> > >> CacheConfiguration that will contain the Compressor
> > > > > > > > > > > > > > > > > > > > > > >>>> > >> interface implementation and some useful parameters.
> > > > > > > > > > > > > > > > > > > > > > >>>> > >> Or maybe provide a BinaryMarshaller decorator that
> > > > > > > > > > > > > > > > > > > > > > >>>> > >> compresses data after marshalling.
> > > > > > > > > > > > > > > > > > > > > >>>> > >>
> > > > > > > > > > > > > > > > > > > > > >>>> > >>
> > > > > > > > > > > > > > > > > > > > > > >>>> > >> 2017-04-10 10:40 GMT+03:00 Alexey Kuznetsov <[hidden email]>:
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> Vyacheslav,
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>>
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> Did you read the initial discussion [1] about
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> compression?
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> As far as I remember, we agreed to add only some
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> "top-level" API in order to provide a way for Ignite
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> users to inject some sort of custom compression.
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>>
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>>
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> [1] http://apache-ignite-developers.2346864.n4.nabble.com/Data-compression-in-Ignite-2-0-td10099.html
> > > > > > > > > > > > > > > > > > > > > >>>> > >>>
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> On Mon, Apr 10, 2017 at 2:19 PM, daradurvs <[hidden email]> wrote:
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>>
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > Hi Igniters!
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> >
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > I am interested in this task:
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > Provide some kind of pluggable compression SPI support
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > <https://issues.apache.org/jira/browse/IGNITE-3592>
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> >
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > I developed a solution at the BinaryMarshaller level,
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > but the reviewer rejected it.
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> >
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > Let's continue the discussion of the task goals and
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > solution design.
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > As I understand it, the main goal of this task is to
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > store data in compressed form.
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > This is what I need from Ignite as its user.
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > Compression saves on servers: we can store more data
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > on the same servers at the cost of increased CPU
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > utilization.
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> >
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > I'm researching the possibility of implementing
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > compression at the cache level.
> > > > > > > > > > > > > > > > > > > > > >>>> > >>> >
> > > > > > > > > > > > > > > > > > > > > >>>> > >>> > Any thoughts?
> > > > > > > > > > > > > > > > > > > > > >>>> > >>> >
> > > > > > > > > > > > > > > > > > > > > >>>> > >>> > --
> > > > > > > > > > > > > > > > > > > > > >>>> > >>> > Best regards,
> > > > > > > > > > > > > > > > > > > > > >>>> > >>> > Vyacheslav
> > > > > > > > > > > > > > > > > > > > > >>>> > >>> >
> > > > > > > > > > > > > > > > > > > > > >>>> > >>> >
> > > > > > > > > > > > > > > > > > > > > >>>> > >>> >
> > > > > > > > > > > > > > > > > > > > > >>>> > >>> >
> > > > > > > > > > > > > > > > > > > > > >>>> > >>> >
> > > > > > > > > > > > > > > > > > > > > >>>> > >>>
> > > > > > > > > > > > > > > > > > > > > >>>> > >>>
> > > > > > > > > > > > > > > > > > > > > >>>> > >>>
> > > > > > > > > > > > > > > > > > > > > >>>> > >>> --
> > > > > > > > > > > > > > > > > > > > > >>>> > >>> Alexey Kuznetsov
> > > > > > > > > > > > > > > > > > > > > >>>> > >>>
> > > > > > > > > > > > > > > > > > > > > >>>> > >>
> > > > > > > > > > > > > > > > > > > > > >>>> > >>
> > > > > > > > > > > > > > > > > > > > > >>>> > >>
> > > > > > > > > > > > > > > > > > > > > >>>> > >> --
> > > > > > > > > > > > > > > > > > > > > >>>> > >> Best Regards, Vyacheslav



--

Best Regards, Anton Churaev

Re: Data compression in Ignite 2.0

Sergi
+1 to Vladimir. Field encryption is the user's responsibility. I see no reason
to introduce additional complexity to Ignite.

Sergi
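
Vladimir's point, quoted below, is that a user who wants one particular field compressed can do it in application code and store the already compressed bytes. A minimal sketch of that idea, assuming plain JDK GZIP streams; `FieldCodec` and `Foo` are hypothetical names used only for illustration and are not part of Ignite:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

/** Hypothetical helper (not an Ignite API): compresses a single text field in user code. */
final class FieldCodec {
    private FieldCodec() {}

    /** Encodes the text as UTF-8 and GZIP-compresses the bytes. */
    static byte[] compress(String text) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(buf)) {
            gz.write(text.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // The stream is closed here, so the GZIP trailer has been written.
        return buf.toByteArray();
    }

    /** Reverses compress(): inflates the bytes and decodes them as UTF-8. */
    static String decompress(byte[] packed) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPInputStream gz = new GZIPInputStream(new ByteArrayInputStream(packed))) {
            byte[] chunk = new byte[4096];
            int n;
            while ((n = gz.read(chunk)) != -1) {
                out.write(chunk, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }
}

/** Hypothetical value class: keeps the large text field only in compressed form. */
final class Foo {
    private final byte[] packedData;

    Foo(String data) {
        this.packedData = FieldCodec.compress(data);
    }

    /** Decompresses on every read, trading CPU for memory. */
    String data() {
        return FieldCodec.decompress(packedData);
    }

    int packedSize() {
        return packedData.length;
    }

    public static void main(String[] args) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 50; i++) {
            sb.append("some repetitive audit text; ");
        }
        Foo foo = new Foo(sb.toString());
        System.out.println("original chars: " + sb.length() + ", packed bytes: " + foo.packedSize());
    }
}
```

An object like `Foo` would then be stored in a cache as usual. The cost is that the field is just a byte array as far as Ignite is concerned, so queries on the original text would presumably not be possible, which is the same trade-off the thread raises for compression in general.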

2017-06-09 11:11 GMT+03:00 Антон Чураев <[hidden email]>:

> Seems that Dmitry is referring to transparent data encryption. It is used
> throughout the whole database industry.
>
> 2017-06-09 10:50 GMT+03:00 Vladimir Ozerov <[hidden email]>:
>
> > Dima,
> >
> > Encryption of certain fields is as bad as compression. First, it is a huge
> > change, which makes the already complex binary protocol even more complex.
> > Second, it has to be ported to the CPP and .NET platforms, as well as to
> > JDBC and ODBC.
> > Last, but most important: encrypting sensitive data is not our headache.
> > It is the user's responsibility. Nobody in their right mind will store
> > passwords in plain form. Instead, the user should encrypt them on his own,
> > choosing proper encryption parameters: algorithms, key lengths, salts,
> > etc. How are you going to expose this in the API or configuration?
> >
> > We should not implement data encryption at the binary level; this is out
> > of the question. Encryption should be implemented at the application level
> > (user effort), the transport layer (SSL, which we already have), and
> > possibly at the disk level (there are tools for this already).
> >
> >
> > On Fri, Jun 9, 2017 at 9:06 AM, Vyacheslav Daradur <[hidden email]>
> > wrote:
> >
> > > >> which is much less useful.
> > > Note that in some cases the object size is reduced by more than a factor
> > > of two.
> > >
> > > >> Would it be possible to change your implementation to handle the
> > > >> encryption instead?
> > > Yes, of course. There is not much difference between compression and
> > > encryption, including in my implementation of per-field compression.
> > >
> > > 2017-06-09 8:55 GMT+03:00 Dmitriy Setrakyan <[hidden email]>:
> > >
> > > > Vyacheslav,
> > > >
> > > > When this feature started out as data compression in Ignite, it sounded
> > > > very useful. Now it is unfolding as per-field compression, which is much
> > > > less useful. In fact, it is questionable whether it is useful at all. The
> > > > fact that this feature is implemented does not make it mandatory for the
> > > > community to accept it.
> > > >
> > > > However, as I mentioned before, per-field encryption is very useful, as it
> > > > would allow users to automatically encrypt certain sensitive fields, like
> > > > passwords, credit card numbers, etc. There is not much conceptual
> > > > difference between compressing a field and encrypting a field. Would it be
> > > > possible to change your implementation to handle the encryption instead?
> > > >
> > > > D.
> > > >
> > > > On Thu, Jun 8, 2017 at 10:42 PM, Vyacheslav Daradur <[hidden email]>
> > > > wrote:
> > > >
> > > > > Guys, I want to be clear:
> > > > > * The "per-field compression" design is the result of researching
> > > > > Ignite's binary infrastructure and some of its other parts (querying,
> > > > > indexing, etc.).
> > > > > * Full compression of an object would be more effective, but in that case
> > > > > there is no compatibility with querying and indexing (or there is a large
> > > > > overhead from decompressing the full object, or cache pages, on demand).
> > > > > * "Per-field compression" is one of the ways to implement the compression
> > > > > feature.
> > > > >
> > > > > I'm new to Ignite, so I may be mistaken about some things.
> > > > > For the last 3-4 months I've tried to start a discussion about the
> > > > > design, but nobody has answered (except Dmitry and Valentin, who were
> > > > > interested in how it works).
> > > > > But I understand that this is a community and nobody is obliged to anybody.
> > > > >
> > > > > There are strong Ignite experts here.
> > > > > If they can help me and the community with the design of the compression
> > > > > feature, it will be great.
> > > > > At the moment I have the desire and the time to work on the development of
> > > > > the compression feature in Ignite.
> > > > > Let's use this opportunity :)
> > > > >
> > > > > 2017-06-09 5:36 GMT+03:00 Dmitriy Setrakyan <[hidden email]>:
> > > > >
> > > > > > Igniters,
> > > > > >
> > > > > > I have never seen a single Ignite user asking about compressing a single
> > > > > > field. However, we have had requests to secure certain fields, e.g.
> > > > > > passwords.
> > > > > >
> > > > > > I personally do not think per-field compression is needed, unless we can
> > > > > > point out some concrete real-life use cases.
> > > > > >
> > > > > > D.
> > > > > >
> > > > > > On Thu, Jun 8, 2017 at 3:42 AM, Vyacheslav Daradur <[hidden email]>
> > > > > > wrote:
> > > > > >
> > > > > > > Anton,
> > > > > > >
> > > > > > > >> I thought that if compressed data is stored in memory, it will be
> > > > > > > >> transmitted over the wire compressed too. Is that right?
> > > > > > >
> > > > > > > In the per-field compression case, yes.
> > > > > > >
> > > > > > > 2017-06-08 13:36 GMT+03:00 Антон Чураев <[hidden email]>:
> > > > > > >
> > > > > > > > Guys, could you please help me?
> > > > > > > > I thought that if compressed data is stored in memory, it will be
> > > > > > > > transmitted over the wire compressed too. Is that right?
> > > > > > > >
> > > > > > > > 2017-06-08 13:30 GMT+03:00 Vyacheslav Daradur <[hidden email]>:
> > > > > > > >
> > > > > > > > > Vladimir,
> > > > > > > > >
> > > > > > > > > The main problem I'm trying to solve is storing data in memory in
> > > > > > > > > compressed form via Ignite.
> > > > > > > > > The main goal is to use memory more effectively.
> > > > > > > > >
> > > > > > > > > >> here the much simpler step would be to do full
> > > > > > > > > >> compression on a per-cache basis rather than dealing with the
> > > > > > > > > >> per-field case.
> > > > > > > > >
> > > > > > > > > Please explain your idea. Compress data by memory page?
> > > > > > > > > Is it compatible with querying and indexing?
> > > > > > > > >
> > > > > > > > > >> In the end, if the user would like to compress a particular field,
> > > > > > > > > >> he can always do it on his own
> > > > > > > > > I don't think we should reason this way: if a user needs something, he
> > > > > > > > > tries to choose a tool that has this feature out of the box.
> > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > 2017-06-08 12:53 GMT+03:00 Vladimir Ozerov <[hidden email]>:
> > > > > > > > >
> > > > > > > > > > Igniters,
> > > > > > > > > >
> > > > > > > > > > Honestly, I still do not see how to apply this feature gracefully to
> > > > > > > > > > Ignite. And the overall approach of compressing only particular fields
> > > > > > > > > > looks overcomplicated to me. Remember that our main use case is an
> > > > > > > > > > application without classes on the server. It means that any kind of
> > > > > > > > > > annotation is inapplicable. To be more precise: a proper API should be
> > > > > > > > > > implemented to handle the no-class case (e.g. how would one build such
> > > > > > > > > > an object through BinaryBuilder without a class?), and only then should
> > > > > > > > > > annotations be added as a convenient addition to the more basic API.
> > > > > > > > > >
> > > > > > > > > > It seems to me that a full implementation, which takes into account a
> > > > > > > > > > proper "classless" API, changes to binary metadata to reflect compressed
> > > > > > > > > > fields, changes to SQL, changes to the binary protocol, and porting to
> > > > > > > > > > .NET and CPP, will yield a very complex solution with little value to
> > > > > > > > > > the product.
> > > > > > > > > >
> > > > > > > > > > Instead, as I proposed earlier, it seems that we'd better start with the
> > > > > > > > > > problem we are trying to solve. Basically, compression could help in two
> > > > > > > > > > cases:
> > > > > > > > > > 1) Transmitting data over the wire - it should be implemented on the
> > > > > > > > > > communication layer and should not affect the binary serialization
> > > > > > > > > > component much.
> > > > > > > > > > 2) Storing data in memory - here the much simpler step would be to do
> > > > > > > > > > full compression on a per-cache basis rather than dealing with the
> > > > > > > > > > per-field case.
> > > > > > > > > >
> > > > > > > > > > In the end, if the user would like to compress a particular field, he
> > > > > > > > > > can always do it on his own, and set the already compressed field to our
> > > > > > > > > > BinaryObject.
> > > > > > > > > >
> > > > > > > > > > Vladimir.
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > On Thu, Jun 8, 2017 at 12:37 PM, Vyacheslav Daradur <[hidden email]>
> > > > > > > > > > wrote:
> > > > > > > > > >
> > > > > > > > > > > Valentin,
> > > > > > > > > > >
> > > > > > > > > > > Yes, I have the prototype [1][2].
> > > > > > > > > > >
> > > > > > > > > > > You can see an example of the Java class [3] that I used in my
> > > > > > > > > > > benchmark.
> > > > > > > > > > > For example:
> > > > > > > > > > > class Foo {
> > > > > > > > > > >     @BinaryCompression
> > > > > > > > > > >     String data;
> > > > > > > > > > > }
> > > > > > > > > > > If the user decides to store the object in compressed form, he can use
> > > > > > > > > > > the annotation @BinaryCompression as shown above.
> > > > > > > > > > > It means the annotated field 'data' will be compressed at marshalling.
> > > > > > > > > > >
> > > > > > > > > > > [1] https://github.com/apache/ignite/pull/1951
> > > > > > > > > > > [2] https://issues.apache.org/jira/browse/IGNITE-5226
> > > > > > > > > > > [3] https://github.com/daradurvs/ignite-compression/blob/master/src/main/java/ru/daradurvs/ignite/compression/model/Audit1F.java
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > 2017-06-08 2:04 GMT+03:00 Valentin Kulichenko <[hidden email]>:
> > > > > > > > > > >
> > > > > > > > > > > > Vyacheslav, Anton,
> > > > > > > > > > > >
> > > > > > > > > > > > Are there any ideas and/or prototypes for the API? Your design
> > > > > > > > > > > > suggestions seem to make sense, but I would like to see how all this
> > > > > > > > > > > > will look from the user's standpoint.
> > > > > > > > > > > >
> > > > > > > > > > > > -Val
> > > > > > > > > > > >
> > > > > > > > > > > > On Wed, Jun 7, 2017 at 1:06 AM, Антон Чураев <[hidden email]>
> > > > > > > > > > > > wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > > Vyacheslav, correct me if something is wrong.
> > > > > > > > > > > > >
> > > > > > > > > > > > > We could give users the opportunity to choose between CPU usage and
> > > > > > > > > > > > > MEM/NET usage by compressing some attributes of stored objects.
> > > > > > > > > > > > > You have studied the design, and it is possible to localize the
> > > > > > > > > > > > > changes in marshalling without affecting performance or current
> > > > > > > > > > > > > functionality.
> > > > > > > > > > > > >
> > > > > > > > > > > > > I think it's useful for our project and users.
> > > > > > > > > > > > > Community, what do you think about this proposal?
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > 2017-06-06 17:29 GMT+03:00 Vyacheslav Daradur <[hidden email]>:
> > > > > > > > > > > > >
> > > > > > > > > > > > > > In short,
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > During marshalling, a field is represented by a BinaryFieldAccessor,
> > > > > > > > > > > > > > which manages its marshalling. It checks whether the field is marked
> > > > > > > > > > > > > > with the @BinaryCompression annotation; if so, the binary
> > > > > > > > > > > > > > representation of the field (a byte array) will be compressed. It
> > > > > > > > > > > > > > will be marked as compressed by a type constant
> > > > > > > > > > > > > > (GridBinaryMarshaller.COMPRESSED), and after this the compressed
> > > > > > > > > > > > > > byte array will be included in the binary representation of the
> > > > > > > > > > > > > > whole object. Note that the header of the marshalled object will not
> > > > > > > > > > > > > > be compressed. Compression affects only the representation of the
> > > > > > > > > > > > > > object's fields.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Objects in IgniteCache are represented as BinaryObject, which is a
> > > > > > > > > > > > > > wrapper over the byte array of the marshalled object.
> > > > > > > > > > > > > > BinaryObject provides some useful methods, which are used by Ignite
> > > > > > > > > > > > > > systems.
> > > > > > > > > > > > > > For example, queries use the BinaryObject#field method, which
> > > > > > > > > > > > > > deserializes only one field of the object, without deserializing the
> > > > > > > > > > > > > > whole object.
> > > > > > > > > > > > > > If the BinaryObject#field method meets the constant of the
> > > > > > > > > > > > > > compressed type during deserialization, it decompresses this byte
> > > > > > > > > > > > > > array, then continues unmarshalling as usual.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Now, I have introduced the Compressor interface in
> > > > > > > > > > > > > > IgniteConfigurations; it allows the user to use his own
> > > > > > > > > > > > > > implementation of a compressor, which is the requirement in the
> > > > > > > > > > > > > > task [1].
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > As far as I know, Vladimir Ozerov doesn't like the idea of granting
> > > > > > > > > > > > > > this opportunity to the user.
> > > > > > > > > > > > > > In that case we can choose a compression algorithm which we will
> > > > > > > > > > > > > > provide by default and move the interface to the internals of the
> > > > > > > > > > > > > > binary infrastructure.
> > > > > > > > > > > > > > For this case I've prepared the benchmarks which I sent earlier.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > I vote for the ZSTD algorithm [2]; it provides a good compression
> > > > > > > > > > > > > > ratio and good throughput. It has implementations in Java, .NET and
> > > > > > > > > > > > > > C++, and has an ASF-friendly license, so we can use it on all Ignite
> > > > > > > > > > > > > > platforms.
> > > > > > > > > > > > > > You can look at an assessment of this algorithm in my benchmarks.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > [1] https://issues.apache.org/jira/browse/IGNITE-3592
> > > > > > > > > > > > > > [2] https://github.com/facebook/zstd
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > 2017-06-06 16:02 GMT+03:00 Антон Чураев <[hidden email]>:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Looks good to me.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Could you propose a design for the implementation in a couple of
> > > > > > > > > > > > > > > sentences, so that we can estimate the completeness and complexity
> > > > > > > > > > > > > > > of the proposal?
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > 2017-06-06 15:26 GMT+03:00 Vyacheslav Daradur <[hidden email]>:
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Anton,
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Of course, the solution does not affect the existing
> > > > > > > > > > > > > > > > implementation. I mean, there are no changes if the user does not
> > > > > > > > > > > > > > > > use the @BinaryCompression annotation (no performance changes).
> > > > > > > > > > > > > > > > Only if the user decides to use compression on a specific field
> > > > > > > > > > > > > > > > or fields of a class will compression be applied at marshalling
> > > > > > > > > > > > > > > > to the annotated fields.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > 2017-06-06 15:10 GMT+03:00 Антон Чураев <[hidden email]>:
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Vyacheslav,
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Is it possible to propose an implementation that can be switched
> > > > > > > > > > > > > > > > > on on demand? In this case it should not affect the performance
> > > > > > > > > > > > > > > > > of the current solution.
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > I mean that users should decide what is more important for them:
> > > > > > > > > > > > > > > > > throughput or memory/net usage.
> > > > > > > > > > > > > > > > > Maybe they will choose not to compress all objects, or to
> > > > > > > > > > > > > > > > > compress only some attributes of objects.
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > 2017-06-06 14:48 GMT+03:00 Vyacheslav Daradur <[hidden email]>:
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > Conclusion:
> > > > > > > > > > > > > > > > > > The provided solution allows reducing the size of an object in
> > > > > > > > > > > > > > > > > > IgniteCache at the cost of a throughput reduction (small in some
> > > > > > > > > > > > > > > > > > cases); it depends on the part of the object that is compressed
> > > > > > > > > > > > > > > > > > and on the compression algorithm.
> > > > > > > > > > > > > > > > > > I mean, we can make more effective use of memory, and in some
> > > > > > > > > > > > > > > > > > cases it can reduce the load on the interconnect (replication,
> > > > > > > > > > > > > > > > > > rebalancing).
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > It will be particularly useful for object fields that are large
> > > > > > > > > > > > > > > > > > texts (>~ 250 bytes) and can be compressed effectively.
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > 2017-06-06 12:00 GMT+03:00 Антон Чураев <[hidden email]>:
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > Vyacheslav, thank you! But could you please provide conclusions
> > > > > > > > > > > > > > > > > > > or proposals based on these benchmarks?
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > 2017-06-06 11:28 GMT+03:00 Vyacheslav Daradur <[hidden email]>:
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > Dmitry,
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > Excel pages:
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > 1) "Compression ratio (2)" - shows object size with and without
> > > > > > > > > > > > > > > > > > > > compression. (Conditions: literal text)
> > > > > > > > > > > > > > > > > > > > The 1st graph shows the compression ratios of the different
> > > > > > > > > > > > > > > > > > > > compression algorithms depending on the size of the compressed field.
> > > > > > > > > > > > > > > > > > > > The 2nd graph shows an evaluation of the size of objects depending
> > > > > > > > > > > > > > > > > > > > on sizes and compression algorithms.
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > 2) "Compression ratio (1)" - shows object size with and without
> > > > > > > > > > > > > > > > > > > > compression. (Conditions: badly compressible character sequence)
> > > > > > > > > > > > > > > > > > > > The 1st graph shows the compression ratios of the different
> > > > > > > > > > > > > > > > > > > > compression algorithms depending on the size of the compressed field.
> > > > > > > > > > > > > > > > > > > > The 2nd graph shows an evaluation of the size of objects depending
> > > > > > > > > > > > > > > > > > > > on sizes and compression algorithms.
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > 3) "put-avg" - shows the average time of the "put" operation
> > > > > > > > > > > > > > > > > > > > depending on size and compression algorithm.
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > 4) "put-thrpt" - shows the throughput of the "put" operation
> > > > > > > > > > > > > > > > > > > > depending on size and compression algorithm.
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > 5) "get-avg" - shows the average time of the "get" operation
> > > > > > > > > > > > > > > > > > > > depending on size and compression algorithm.
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > 6) "get-thrpt" - shows the throughput of the "get" operation
> > > > > > > > > > > > > > > > > > > > depending on size and compression algorithm.
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > 2017-06-06 10:59 GMT+03:00 Dmitriy Setrakyan <[hidden email]>:
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > Vladimir, I am not sure how to interpret the graphs. What are we
> > > > > > > > > > > > > > > > > > > > > looking at?
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > On Tue, Jun 6, 2017 at 12:33 AM, Vyacheslav Daradur <[hidden email]>
> > > > > > > > > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > Hi, Igniters.
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > I've prepared some benchmarks. Results [1].
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > And I've prepared the evaluation in the form of diagrams [2].
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > I hope that helps to interest the community and accelerates a
> > > > > > > > > > > > > > > > > > > > > > reaction to this improvement :)
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > [1] https://github.com/daradurvs/ignite-compression/tree/master/src/main/resources/result
> > > > > > > > > > > > > > > > > > > > > > [2] https://drive.google.com/file/d/0B2CeUAOgrHkoMklyZ25YTEdKcEk/view
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > 2017-05-24 9:49 GMT+03:00 Vyacheslav Daradur <[hidden email]>:
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > Guys, any thoughts?
> > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > 2017-05-16 13:40 GMT+03:00 Vyacheslav Daradur <[hidden email]>:
> > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > >> Hi guys,
> > > > > > > > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > > > > > > > >> I've prepared the PR to show my idea.
> > > > > > > > > > > > > > > > > > > > > > >> https://github.com/apache/ignite/pull/1951/files
> > > > > > > > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > > > > > > > >> About querying - I've just copied existing tests and annotated the
> > > > > > > > > > > > > > > > > > > > > > >> testing data.
> > > > > > > > > > > > > > > > > > > > > > >> https://github.com/apache/ignite/pull/1951/files#diff-c19a9df4058141d059bb577e75244764
> > > > > > > > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > > > > > > > >> It means fields marked with @BinaryCompression will be compressed
> > > > > > > > > > > > > > > > > > > > > > >> at marshalling via BinaryMarshaller.
> > > > > > > > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > > > > > > > >> This solution has no effect on existing data or project
> > > > > > > > > > > > > > > > > > > > > > >> architecture.
> > > > > > > > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > > > > > > > >> I'll be glad to see your thoughts.
> > > > > > > > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > > > > > > > >> 2017-05-15 19:18 GMT+03:00
> > > > Vyacheslav
> > > > > > > > Daradur
> > > > > > > > > <
> > > > > > > > > > > > > > > > > > > [hidden email]
> > > > > > > > > > > > > > > > > > > > >:
> > > > > > > > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > > > > > > > >>> Dmitriy,
> > > > > > > > > > > > > > > > > > > > > > >>>
> > > > > > > > > > > > > > > > > > > > > > >>> I have ready prototype. I
> want
> > to
> > > > > show
> > > > > > > it.
> > > > > > > > > > > > > > > > > > > > > > >>> It is always easier to
> discuss
> > on
> > > > > > > example.
> > > > > > > > > > > > > > > > > > > > > > >>>
> > > > > > > > > > > > > > > > > > > > > > >>> 2017-05-15 19:02 GMT+03:00
> > > Dmitriy
> > > > > > > > Setrakyan
> > > > > > > > > <
> > > > > > > > > > > > > > > > > > > > [hidden email]
> > > > > > > > > > > > > > > > > > > > > >:
> > > > > > > > > > > > > > > > > > > > > > >>>
> > > > > > > > > > > > > > > > > > > > > > >>>> Vyacheslav,
> > > > > > > > > > > > > > > > > > > > > > >>>>
> > > > > > > > > > > > > > > > > > > > > > >>>> I think it is a bit
> premature
> > to
> > > > > > > provide a
> > > > > > > > > PR
> > > > > > > > > > > > > without
> > > > > > > > > > > > > > > > > getting
> > > > > > > > > > > > > > > > > > a
> > > > > > > > > > > > > > > > > > > > > > >>>> community
> > > > > > > > > > > > > > > > > > > > > > >>>> consensus on the dev list.
> > > Please
> > > > > > allow
> > > > > > > > some
> > > > > > > > > > > time
> > > > > > > > > > > > > for
> > > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > > > community
> > > > > > > > > > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > > > > > > > >>>> respond.
> > > > > > > > > > > > > > > > > > > > > > >>>>
> > > > > > > > > > > > > > > > > > > > > > >>>> D.
> > > > > > > > > > > > > > > > > > > > > > >>>>
> > > > > > > > > > > > > > > > > > > > > > >>>> On Mon, May 15, 2017 at 6:36
> > AM,
> > > > > > > > Vyacheslav
> > > > > > > > > > > > Daradur
> > > > > > > > > > > > > <
> > > > > > > > > > > > > > > > > > > > > > >>>> [hidden email]>
> > > > > > > > > > > > > > > > > > > > > > >>>> wrote:
> > > > > > > > > > > > > > > > > > > > > > >>>>
> > > > > > > > > > > > > > > > > > > > > > >>>> > I created the ticket:
> > > > > > > > > > > > > > https://issues.apache.org/jira
> > > > > > > > > > > > > > > > > > > > > > >>>> /browse/IGNITE-5226
> > > > > > > > > > > > > > > > > > > > > > >>>> >
> > > > > > > > > > > > > > > > > > > > > > >>>> > I'll prepare a PR with
> > > described
> > > > > > > > solution
> > > > > > > > > in
> > > > > > > > > > > > > couple
> > > > > > > > > > > > > > of
> > > > > > > > > > > > > > > > > days.
> > > > > > > > > > > > > > > > > > > > > > >>>> >
> > > > > > > > > > > > > > > > > > > > > > >>>> > 2017-05-15 15:05 GMT+03:00
> > > > > > Vyacheslav
> > > > > > > > > > Daradur
> > > > > > > > > > > <
> > > > > > > > > > > > > > > > > > > > > [hidden email]
> > > > > > > > > > > > > > > > > > > > > > >:
> > > > > > > > > > > > > > > > > > > > > > >>>> >
> > > > > > > > > > > > > > > > > > > > > > >>>> > > Hi, Igniters!
> > > > > > > > > > > > > > > > > > > > > > >>>> > >
> > > > > > > > > > > > > > > > > > > > > > >>>> > > Apache 2.0 is released.
> > > > > > > > > > > > > > > > > > > > > > >>>> > >
> > > > > > > > > > > > > > > > > > > > > > >>>> > > Let's continue the
> > > discussion
> > > > > > about
> > > > > > > a
> > > > > > > > > > > > > compression
> > > > > > > > > > > > > > > > > design.
> > > > > > > > > > > > > > > > > > > > > > >>>> > >
> > > > > > > > > > > > > > > > > > > > > > >>>> > > At the moment, I found
> > only
> > > > one
> > > > > > > > solution
> > > > > > > > > > > which
> > > > > > > > > > > > > is
> > > > > > > > > > > > > > > > > > compatible
> > > > > > > > > > > > > > > > > > > > > with
> > > > > > > > > > > > > > > > > > > > > > >>>> > querying
> > > > > > > > > > > > > > > > > > > > > > >>>> > > and indexing, this is
> > > > > > > > per-objects-field
> > > > > > > > > > > > > > compression.
> > > > > > > > > > > > > > > > > > > > > > >>>> > > Per-fields compression
> > means
> > > > > that
> > > > > > > > > metadata
> > > > > > > > > > > (a
> > > > > > > > > > > > > > > header)
> > > > > > > > > > > > > > > > of
> > > > > > > > > > > > > > > > > > an
> > > > > > > > > > > > > > > > > > > > > object
> > > > > > > > > > > > > > > > > > > > > > >>>> won't
> > > > > > > > > > > > > > > > > > > > > > >>>> > > be compressed, only
> > > serialized
> > > > > > > values
> > > > > > > > of
> > > > > > > > > > an
> > > > > > > > > > > > > object
> > > > > > > > > > > > > > > > > fields
> > > > > > > > > > > > > > > > > > > (in
> > > > > > > > > > > > > > > > > > > > > > bytes
> > > > > > > > > > > > > > > > > > > > > > >>>> array
> > > > > > > > > > > > > > > > > > > > > > >>>> > > form) will be
> compressed.
> > > > > > > > > > > > > > > > > > > > > > >>>> > >
> > > > > > > > > > > > > > > > > > > > > > >>>> > > This solution have some
> > > > > > contentious
> > > > > > > > > > issues:
> > > > > > > > > > > > > > > > > > > > > > >>>> > > - small values, like
> > > > primitives
> > > > > > and
> > > > > > > > > short
> > > > > > > > > > > > > arrays -
> > > > > > > > > > > > > > > > there
> > > > > > > > > > > > > > > > > > > isn't
> > > > > > > > > > > > > > > > > > > > > > >>>> sense to
> > > > > > > > > > > > > > > > > > > > > > >>>> > > compress them;
> > > > > > > > > > > > > > > > > > > > > > >>>> > > - there is no possible
> to
> > > use
> > > > > > > > > compression
> > > > > > > > > > > with
> > > > > > > > > > > > > > > > > > > java-predefined
> > > > > > > > > > > > > > > > > > > > > > >>>> types;
> > > > > > > > > > > > > > > > > > > > > > >>>> > >
> > > > > > > > > > > > > > > > > > > > > > >>>> > > We can provide an
> > > annotation,
> > > > > > > > > > > > > @IgniteCompression -
> > > > > > > > > > > > > > > for
> > > > > > > > > > > > > > > > > > > > example,
> > > > > > > > > > > > > > > > > > > > > > >>>> which can
> > > > > > > > > > > > > > > > > > > > > > >>>> > > be used by users for
> > marking
> > > > > > fields
> > > > > > > to
> > > > > > > > > > > > compress.
> > > > > > > > > > > > > > > > > > > > > > >>>> > >
> > > > > > > > > > > > > > > > > > > > > > >>>> > > Any thoughts?
> > > > > > > > > > > > > > > > > > > > > > >>>> > >
> > > > > > > > > > > > > > > > > > > > > > >>>> > > Maybe someone already
> have
> > > > ready
> > > > > > > > design?
> > > > > > > > > > > > > > > > > > > > > > >>>> > >
> > > > > > > > > > > > > > > > > > > > > > >>>> > > 2017-04-10 11:06
> GMT+03:00
> > > > > > > Vyacheslav
> > > > > > > > > > > Daradur
> > > > > > > > > > > > <
> > > > > > > > > > > > > > > > > > > > > > [hidden email]
> > > > > > > > > > > > > > > > > > > > > > >>>> >:
> > > > > > > > > > > > > > > > > > > > > > >>>> > >
> > > > > > > > > > > > > > > > > > > > > > >>>> > >> Alexey,
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>
> > > > > > > > > > > > > > > > > > > > > > >>>> > >> Yes, I've read it.
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>
> > > > > > > > > > > > > > > > > > > > > > >>>> > >> Ok, let's discuss about
> > > > public
> > > > > > API
> > > > > > > > > > design.
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>
> > > > > > > > > > > > > > > > > > > > > > >>>> > >> I think we need to add
> > > some a
> > > > > > > > configure
> > > > > > > > > > > > entity
> > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > > > > > > > >>>> CacheConfiguration,
> > > > > > > > > > > > > > > > > > > > > > >>>> > >> which will contain the
> > > > > Compressor
> > > > > > > > > > interface
> > > > > > > > > > > > > > > > > > implementation
> > > > > > > > > > > > > > > > > > > > and
> > > > > > > > > > > > > > > > > > > > > > some
> > > > > > > > > > > > > > > > > > > > > > >>>> > usefull
> > > > > > > > > > > > > > > > > > > > > > >>>> > >> parameters.
> > > > > > > > > > > > > > > > > > > > > > >>>> > >> Or maybe to provide a
> > > > > > > > BinaryMarshaller
> > > > > > > > > > > > > decorator,
> > > > > > > > > > > > > > > > which
> > > > > > > > > > > > > > > > > > > will
> > > > > > > > > > > > > > > > > > > > be
> > > > > > > > > > > > > > > > > > > > > > >>>> compress
> > > > > > > > > > > > > > > > > > > > > > >>>> > >> data after marshalling.
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>
> > > > > > > > > > > > > > > > > > > > > > >>>> > >> 2017-04-10 10:40
> > GMT+03:00
> > > > > Alexey
> > > > > > > > > > > Kuznetsov <
> > > > > > > > > > > > > > > > > > > > > > [hidden email]
> > > > > > > > > > > > > > > > > > > > > > >>>> >:
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> Vyacheslav,
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>>
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> Did you read initial
> > > > > discussion
> > > > > > > [1]
> > > > > > > > > > about
> > > > > > > > > > > > > > > > compression?
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> As far as I remember
> we
> > > > agreed
> > > > > > to
> > > > > > > > add
> > > > > > > > > > only
> > > > > > > > > > > > > some
> > > > > > > > > > > > > > > > > > > "top-level"
> > > > > > > > > > > > > > > > > > > > > API
> > > > > > > > > > > > > > > > > > > > > > in
> > > > > > > > > > > > > > > > > > > > > > >>>> > order
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> to
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> provide a way for
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> Ignite users to inject
> > > some
> > > > > sort
> > > > > > > of
> > > > > > > > > > custom
> > > > > > > > > > > > > > > > > compression.
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>>
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>>
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> [1]
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>>
> > > > > http://apache-ignite-developer
> > > > > > > > > > > > > > s.2346864.n4.nabble
> > > > > > > > > > > > > > > .
> > > > > > > > > > > > > > > > > > > > com/Data-c
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>>
> > ompression-in-Ignite-2-0-
> > > > > > > > td10099.html
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>>
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> On Mon, Apr 10, 2017
> at
> > > 2:19
> > > > > PM,
> > > > > > > > > > > daradurvs <
> > > > > > > > > > > > > > > > > > > > > [hidden email]
> > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > >>>> > wrote:
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>>
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > Hi Igniters!
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> >
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > I am interested in
> > this
> > > > > task.
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > Provide some kind of
> > > > > pluggable
> > > > > > > > > > > compression
> > > > > > > > > > > > > SPI
> > > > > > > > > > > > > > > > > support
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > <
> > > > https://issues.apache.org/
> > > > > > > > > > > > > > > > jira/browse/IGNITE-3592>
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> >
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > I developed a
> solution
> > > on
> > > > > > > > > > > > > > > BinaryMarshaller-level,
> > > > > > > > > > > > > > > > > but
> > > > > > > > > > > > > > > > > > > > > reviewer
> > > > > > > > > > > > > > > > > > > > > > >>>> has
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> rejected
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > it.
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> >
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > Let's continue
> > > discussion
> > > > of
> > > > > > > task
> > > > > > > > > > goals
> > > > > > > > > > > > and
> > > > > > > > > > > > > > > > solution
> > > > > > > > > > > > > > > > > > > > design.
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > As I understood
> that,
> > > the
> > > > > main
> > > > > > > > goal
> > > > > > > > > of
> > > > > > > > > > > > this
> > > > > > > > > > > > > > task
> > > > > > > > > > > > > > > > is
> > > > > > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > > > > > store
> > > > > > > > > > > > > > > > > > > > > > >>>> data in
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > compressed form.
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > This is what I need
> > from
> > > > > > Ignite
> > > > > > > as
> > > > > > > > > its
> > > > > > > > > > > > user.
> > > > > > > > > > > > > > > > > > Compression
> > > > > > > > > > > > > > > > > > > > > > >>>> provides
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> economy
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > on
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > servers.
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > We can store more
> data
> > > on
> > > > > same
> > > > > > > > > servers
> > > > > > > > > > > at
> > > > > > > > > > > > > the
> > > > > > > > > > > > > > > cost
> > > > > > > > > > > > > > > > > of
> > > > > > > > > > > > > > > > > > > > > > >>>> increasing CPU
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > utilization.
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> >
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > I'm researching a
> > > > > possibility
> > > > > > of
> > > > > > > > > > > > > > implementation
> > > > > > > > > > > > > > > of
> > > > > > > > > > > > > > > > > > > > > compression
> > > > > > > > > > > > > > > > > > > > > > >>>> at the
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > cache-level.
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> >
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > Any thoughts?
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> >
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > --
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > Best regards,
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > Vyacheslav
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> >
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> >
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> >
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> >
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > --
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > View this message in
> > > > > context:
> > > > > > > > > > > > > > > > http://apache-ignite-
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> >
> > > > > developers.2346864.n4.nabble.
> > > > > > > > > > > > > > > > > com/Data-compression-in-
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> >
> > > > > Ignite-2-0-tp10099p16317.html
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > Sent from the Apache
> > > > Ignite
> > > > > > > > > Developers
> > > > > > > > > > > > > mailing
> > > > > > > > > > > > > > > > list
> > > > > > > > > > > > > > > > > > > > archive
> > > > > > > > > > > > > > > > > > > > > at
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> Nabble.com.
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> >
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>>
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>>
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>>
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> --
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> Alexey Kuznetsov
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>>
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>
> > > > > > > > > > > > > > > > > > > > > > >>>> > >> --
> > > > > > > > > > > > > > > > > > > > > > >>>> > >> Best Regards,
> Vyacheslav
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>
> > > > > > > > > > > > > > > > > > > --
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > Best Regards, Anton Churaev
> > > > > > > > > > > > > > > > > > >

Re: Data compression in Ignite 2.0

Sergey Kozlov
In reply to this post by Антон Чураев
Hi

* "Per-field compression" is applicable for huge BLOB fields and will
impose restrictions: such fields cannot be indexed, reads are slower, and
there are potential OOM issues if the compression ratio is too high.
But in some cases it makes sense.
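
The trade-offs around a compressed BLOB field can be sketched with the JDK
alone. This is only an illustration under stated assumptions, not Ignite
code: FieldCompressionSketch, compress and decompress are hypothetical names,
DEFLATE from java.util.zip is an assumed codec, and the size cap is one
assumed way to limit the OOM risk when the compression ratio is very high.

```java
import java.io.ByteArrayOutputStream;
import java.util.Arrays;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

/** Hypothetical helper for illustration; not part of the Ignite API. */
final class FieldCompressionSketch {
    /** Compresses a single field value with DEFLATE. */
    static byte[] compress(byte[] raw) {
        Deflater deflater = new Deflater(Deflater.BEST_SPEED);
        deflater.setInput(raw);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[4096];
        while (!deflater.finished()) {
            int n = deflater.deflate(buf);
            out.write(buf, 0, n);
        }
        deflater.end();
        return out.toByteArray();
    }

    /** Inflates a field value, refusing to produce more than maxSize bytes. */
    static byte[] decompress(byte[] packed, int maxSize) {
        Inflater inflater = new Inflater();
        inflater.setInput(packed);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[4096];
        try {
            while (!inflater.finished()) {
                int n = inflater.inflate(buf);
                // No progress and no more input: the stream is truncated.
                if (n == 0 && !inflater.finished()
                        && (inflater.needsInput() || inflater.needsDictionary()))
                    throw new IllegalArgumentException("Truncated or unsupported compressed field");
                // Guard against a tiny payload expanding into a huge array.
                if (out.size() + n > maxSize)
                    throw new IllegalStateException("Field would exceed " + maxSize + " bytes");
                out.write(buf, 0, n);
            }
        } catch (DataFormatException e) {
            throw new IllegalArgumentException("Corrupted compressed field", e);
        } finally {
            inflater.end();
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        // A highly repetitive 1 MB "BLOB" field, where compression pays off most.
        byte[] blob = new byte[1 << 20];
        Arrays.fill(blob, (byte) 'x');

        byte[] packed = compress(blob);
        System.out.println("raw=" + blob.length + " bytes, compressed=" + packed.length + " bytes");

        // Reading the field back costs CPU and needs the full raw size on the heap.
        byte[] restored = decompress(packed, blob.length);
        System.out.println("round trip ok: " + Arrays.equals(blob, restored));
    }
}
```

The stored bytes are opaque, so an index built over them could not answer
equality or range queries on the original value, and every read pays for
the inflate step.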

On Fri, Jun 9, 2017 at 11:11 AM, Антон Чураев <[hidden email]> wrote:

> Seems that Dmitry is referring to transparent data encryption. It is used
> throughout the whole database industry.
>
> 2017-06-09 10:50 GMT+03:00 Vladimir Ozerov <[hidden email]>:
>
> > Dima,
> >
> > Encryption of certain fields is as bad as compression. First, it is a huge
> > change, which makes the already complex binary protocol even more complex.
> > Second, it has to be ported to the CPP and .NET platforms, as well as to
> > JDBC and ODBC.
> > Last, but most important - it is not our headache to encrypt sensitive
> > data. This is the user's responsibility. Nobody in their right mind will
> > store passwords in plain form. Instead, the user should encrypt them on his
> > own, choosing proper encryption parameters - algorithms, key lengths,
> > salts, etc. How are you going to expose this in the API or configuration?
> >
> > We should not implement data encryption on the binary level; this is out
> > of the question. Encryption should be implemented on the application level
> > (user efforts), the transport layer (SSL - we already have it), and
> > possibly on the disk level (there are tools for this already).
> >
> >
> > On Fri, Jun 9, 2017 at 9:06 AM, Vyacheslav Daradur <[hidden email]>
> > wrote:
> >
> > > >> which is much less useful.
> > > I would note that in some cases the object size is reduced by more than
> > > half.
> > >
> > > >> Would it be possible to change your implementation to handle the
> > > >> encryption instead?
> > > Yes, of course, there's not much difference between compression and
> > > encryption, including in my implementation of per-field compression.
> > >
> > > 2017-06-09 8:55 GMT+03:00 Dmitriy Setrakyan <[hidden email]>:
> > >
> > > > Vyacheslav,
> > > >
> > > > When this feature started out as data compression in Ignite, it sounded
> > > > very useful. Now it is unfolding as per-field compression, which is much
> > > > less useful. In fact, it is questionable whether it is useful at all. The
> > > > fact that this feature is implemented does not make it mandatory for the
> > > > community to accept it.
> > > >
> > > > However, as I mentioned before, per-field encryption is very useful, as
> > > > it would allow users to automatically encrypt certain sensitive fields,
> > > > like passwords, credit card numbers, etc. There is not much conceptual
> > > > difference between compressing a field vs encrypting a field. Would it
> > > > be possible to change your implementation to handle the encryption
> > > > instead?
> > > >
> > > > D.
> > > >
> > > > On Thu, Jun 8, 2017 at 10:42 PM, Vyacheslav Daradur <[hidden email]>
> > > > wrote:
> > > >
> > > > > Guys, I want to be clear:
> > > > > * The "per-field compression" design is the result of researching the
> > > > > binary infrastructure of Ignite and some of its other parts (querying,
> > > > > indexing, etc.)
> > > > > * Full compression of an object would be more effective, but in that
> > > > > case there is no compatibility with querying and indexing (or there is
> > > > > a large overhead from decompressing the full object (or cache pages) on
> > > > > demand)
> > > > > * "Per-field compression" is one of the ways to implement the
> > > > > compression feature
> > > > >
> > > > > I'm new to Ignite, so I may be mistaken about some things.
> > > > > For the last 3-4 months I've tried to start a discussion about the
> > > > > design, but nobody has answered (except Dmitry and Valentin, who was
> > > > > interested in how it works).
> > > > > But I understand that this is a community and nobody is obliged to
> > > > > anybody.
> > > > >
> > > > > There are strong Ignite experts.
> > > > > If they can help me and the community with a design of the compression
> > > > > feature, it will be great.
> > > > > At the moment I have the desire and the time to be engaged in the
> > > > > development of the compression feature in Ignite.
> > > > > Let's use this opportunity :)
> > > > >
> > > > > 2017-06-09 5:36 GMT+03:00 Dmitriy Setrakyan <[hidden email]>:
> > > > >
> > > > > > Igniters,
> > > > > >
> > > > > > I have never seen a single Ignite user asking about compressing a
> > > > > > single field. However, we have had requests to secure certain fields,
> > > > > > e.g. passwords.
> > > > > >
> > > > > > I personally do not think per-field compression is needed, unless we
> > > > > > can point out some concrete real life use cases.
> > > > > >
> > > > > > D.
> > > > > >
> > > > > > On Thu, Jun 8, 2017 at 3:42 AM, Vyacheslav Daradur <[hidden email]>
> > > > > > wrote:
> > > > > >
> > > > > > > Anton,
> > > > > > >
> > > > > > > >> I thought that if compressed data is stored in memory, it will be
> > > > > > > >> transmitted over the wire compressed too. Is that right?
> > > > > > >
> > > > > > > In the per-field compression case - yes.
> > > > > > >
> > > > > > > 2017-06-08 13:36 GMT+03:00 Антон Чураев <[hidden email]>:
> > > > > > >
> > > > > > > > Guys, could you please help me.
> > > > > > > > I thought that if compressed data is stored in memory, it will be
> > > > > > > > transmitted over the wire compressed too. Is that right?
> > > > > > > >
> > > > > > > > 2017-06-08 13:30 GMT+03:00 Vyacheslav Daradur <[hidden email]>:
> > > > > > > >
> > > > > > > > > Vladimir,
> > > > > > > > >
> > > > > > > > > The main problem which I'm trying to solve is storing data in
> > > > > > > > > memory in a compressed form via Ignite.
> > > > > > > > > The main goal is using memory more effectively.
> > > > > > > > >
> > > > > > > > > >> here the much simpler step would be full compression on a
> > > > > > > > > >> per-cache basis rather than dealing with the per-field case.
> > > > > > > > >
> > > > > > > > > Please explain your idea. Compress data by memory page?
> > > > > > > > > Is it compatible with querying and indexing?
> > > > > > > > >
> > > > > > > > > >> In the end, if a user would like to compress a particular
> > > > > > > > > >> field, he can always do it on his own
> > > > > > > > > I think we mustn't think this way: if a user needs something, he
> > > > > > > > > tries to choose a tool which has this feature OOTB.
> > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > 2017-06-08 12:53 GMT+03:00 Vladimir Ozerov <[hidden email]>:
> > > > > > > > >
> > > > > > > > > > Igniters,
> > > > > > > > > >
> > > > > > > > > > Honestly I still do not see how to apply this feature
> > > > > > > > > > gracefully to Ignite. And the overall approach of compressing
> > > > > > > > > > only particular fields looks overcomplicated to me. Remember
> > > > > > > > > > that our main use case is an application without classes on the
> > > > > > > > > > server. It means that any kind of annotations are inapplicable.
> > > > > > > > > > To be more precise: a proper API should be implemented to handle
> > > > > > > > > > the no-class case (e.g. how would one build such an object
> > > > > > > > > > through BinaryBuilder without a class?), and only then should
> > > > > > > > > > annotations be added as a convenient addition to the more basic
> > > > > > > > > > API.
> > > > > > > > > >
> > > > > > > > > > It seems to me that a full implementation, which takes into
> > > > > > > > > > account a proper "classless" API, changes to binary metadata to
> > > > > > > > > > reflect compressed fields, changes to SQL, changes to the binary
> > > > > > > > > > protocol, and porting to .NET and CPP, will yield a very complex
> > > > > > > > > > solution with little value to the product.
> > > > > > > > > >
> > > > > > > > > > Instead, as I proposed earlier, it seems that we'd better start
> > > > > > > > > > with the problem we are trying to solve. Basically, compression
> > > > > > > > > > could help in two cases:
> > > > > > > > > > 1) Transmitting data over wire - it should be implemented on the
> > > > > > > > > > communication layer and should not affect the binary
> > > > > > > > > > serialization component a lot.
> > > > > > > > > > 2) Storing data in memory - here the much simpler step would be
> > > > > > > > > > full compression on a per-cache basis rather than dealing with
> > > > > > > > > > the per-field case.
> > > > > > > > > >
> > > > > > > > > > In the end, if a user would like to compress a particular field,
> > > > > > > > > > he can always do it on his own, and set the already compressed
> > > > > > > > > > field to our BinaryObject.
> > > > > > > > > >
> > > > > > > > > > Vladimir.
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > On Thu, Jun 8, 2017 at 12:37 PM, Vyacheslav Daradur <[hidden email]>
> > > > > > > > > > wrote:
> > > > > > > > > >
> > > > > > > > > > > Valentin,
> > > > > > > > > > >
> > > > > > > > > > > Yes, I have the prototype [1][2]
> > > > > > > > > > >
> > > > > > > > > > > You can see an example of the Java class [3] that I used in my
> > > > > > > > > > > benchmark.
> > > > > > > > > > > For example:
> > > > > > > > > > > class Foo {
> > > > > > > > > > >     @BinaryCompression
> > > > > > > > > > >     String data;
> > > > > > > > > > > }
> > > > > > > > > > > If a user decides to store the object in compressed form, he
> > > > > > > > > > > can use the annotation @BinaryCompression as shown above.
> > > > > > > > > > > It means the annotated field 'data' will be compressed at
> > > > > > > > > > > marshalling.
> > > > > > > > > > >
> > > > > > > > > > > [1] https://github.com/apache/ignite/pull/1951
> > > > > > > > > > > [2] https://issues.apache.org/jira/browse/IGNITE-5226
> > > > > > > > > > > [3] https://github.com/daradurvs/ignite-compression/blob/master/src/main/java/ru/daradurvs/ignite/compression/model/Audit1F.java
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > 2017-06-08 2:04 GMT+03:00 Valentin Kulichenko <
> > > > > > > > > > > [hidden email]
> > > > > > > > > > > >:
> > > > > > > > > > >
> > > > > > > > > > > > Vyacheslav, Anton,
> > > > > > > > > > > >
> > > > > > > > > > > > > Are there any ideas and/or prototypes for the API? Your design
> > > > > > > > > > > > > suggestions seem to make sense, but I would like to see how all this
> > > > > > > > > > > > > will look from a user's standpoint.
> > > > > > > > > > > >
> > > > > > > > > > > > -Val
> > > > > > > > > > > >
> > > > > > > > > > > > On Wed, Jun 7, 2017 at 1:06 AM, Антон Чураев <
> > > > > > > [hidden email]
> > > > > > > > >
> > > > > > > > > > > wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > > > Vyacheslav, correct me if something is wrong.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > We could give users the opportunity to choose between CPU usage and
> > > > > > > > > > > > > > MEM/NET usage by compressing some attributes of stored objects.
> > > > > > > > > > > > > > You have studied the design, and it is possible to localize the
> > > > > > > > > > > > > > changes in marshalling without affecting performance or current
> > > > > > > > > > > > > > functionality.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > I think that it's useful for our project and users.
> > > > > > > > > > > > > > Community, what do you think about this proposal?
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > 2017-06-06 17:29 GMT+03:00 Vyacheslav Daradur <
> > > > > > > > [hidden email]
> > > > > > > > > >:
> > > > > > > > > > > > >
> > > > > > > > > > > > > > In short,
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > During marshalling, each field is represented by a
> > > > > > > > > > > > > > > BinaryFieldAccessor, which manages its marshalling. It checks
> > > > > > > > > > > > > > > whether the field is marked with the @BinaryCompression annotation;
> > > > > > > > > > > > > > > if so, the binary representation of the field (a byte array) is
> > > > > > > > > > > > > > > compressed and marked as compressed with a type constant
> > > > > > > > > > > > > > > (GridBinaryMarshaller.COMPRESSED). After this, the compressed byte
> > > > > > > > > > > > > > > array is included in the binary representation of the whole object.
> > > > > > > > > > > > > > > Note that the header of the marshalled object is not compressed.
> > > > > > > > > > > > > > > Compression affects only the representation of the object's fields.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Objects in IgniteCache are represented as BinaryObject, which is a
> > > > > > > > > > > > > > > wrapper over the byte array of the marshalled object.
> > > > > > > > > > > > > > > BinaryObject provides some useful methods, which are used by Ignite
> > > > > > > > > > > > > > > subsystems.
> > > > > > > > > > > > > > > For example, queries use the BinaryObject#field method, which
> > > > > > > > > > > > > > > deserializes a single field of the object without deserializing the
> > > > > > > > > > > > > > > whole object.
> > > > > > > > > > > > > > > If the BinaryObject#field method meets the compressed type constant
> > > > > > > > > > > > > > > during deserialization, it decompresses the byte array and then
> > > > > > > > > > > > > > > continues unmarshalling as usual.
> > > > > > > > > > > > > >
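The write and read paths described in the two paragraphs above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from the PR: the class name, the PLAIN and COMPRESSED marker values, the length prefix, and the use of java.util.zip instead of a pluggable compressor are all made up for the example.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

/** Illustrative sketch only; not the code from the PR. */
final class FieldCompressionSketch {
    /** Hypothetical type markers; the real GridBinaryMarshaller constants may differ. */
    static final byte PLAIN = 1;
    static final byte COMPRESSED = 2;

    /** Stores a field as [PLAIN][length][bytes]. */
    static byte[] writePlain(byte[] field) {
        return ByteBuffer.allocate(5 + field.length)
            .put(PLAIN).putInt(field.length).put(field).array();
    }

    /** Stores a field as [COMPRESSED][original length][deflated bytes]. */
    static byte[] writeCompressed(byte[] field) {
        Deflater deflater = new Deflater();
        deflater.setInput(field);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[1024];
        while (!deflater.finished())
            out.write(buf, 0, deflater.deflate(buf));
        deflater.end();
        byte[] packed = out.toByteArray();
        return ByteBuffer.allocate(5 + packed.length)
            .put(COMPRESSED).putInt(field.length).put(packed).array();
    }

    /** Reads one field, decompressing it first if the type marker says so. */
    static byte[] readField(byte[] stored) {
        ByteBuffer in = ByteBuffer.wrap(stored);
        byte type = in.get();
        byte[] field = new byte[in.getInt()];
        if (type == PLAIN) {
            in.get(field);
            return field;
        }
        if (type != COMPRESSED)
            throw new IllegalArgumentException("Unknown type marker: " + type);
        Inflater inflater = new Inflater();
        inflater.setInput(stored, in.position(), in.remaining());
        try {
            int off = 0;
            while (off < field.length) {
                int n = inflater.inflate(field, off, field.length - off);
                if (n == 0)
                    throw new IllegalStateException("Truncated compressed field");
                off += n;
            }
        } catch (DataFormatException e) {
            throw new IllegalStateException("Corrupted compressed field", e);
        } finally {
            inflater.end();
        }
        return field;
    }
}
```

Because the marker sits in front of each field rather than around the whole object, a reader can still locate a single field and decompress only that field, which is the property the querying argument relies on.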
> > > > > > > > > > > > > > > For now, I have introduced the Compressor interface in
> > > > > > > > > > > > > > > IgniteConfigurations; it allows a user to plug in his own
> > > > > > > > > > > > > > > compressor implementation, which is a requirement of the task [1].
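To make the idea of a user-supplied compressor concrete, here is a minimal sketch. The interface shape below is an assumption for illustration; the Compressor interface in the PR may have different method names and signatures. The implementation uses only the JDK's GZIP streams.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

/** Hypothetical shape of a pluggable compressor; the interface in the PR may differ. */
interface Compressor {
    byte[] compress(byte[] bytes);

    byte[] decompress(byte[] bytes);
}

/** Example user implementation backed by the JDK's GZIP streams. */
final class GzipCompressor implements Compressor {
    @Override public byte[] compress(byte[] bytes) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        // Closing the GZIP stream writes the trailer, so read the result only afterwards.
        try (GZIPOutputStream gzip = new GZIPOutputStream(out)) {
            gzip.write(bytes);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    @Override public byte[] decompress(byte[] bytes) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPInputStream gzip = new GZIPInputStream(new ByteArrayInputStream(bytes))) {
            byte[] buf = new byte[4096];
            for (int n; (n = gzip.read(buf)) != -1; )
                out.write(buf, 0, n);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }
}
```

With an interface like this, swapping GZIP for another algorithm would only require a different implementation class.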
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > As far as I know, Vladimir Ozerov doesn't like the idea of granting
> > > > > > > > > > > > > > > this opportunity to the user.
> > > > > > > > > > > > > > > In that case we can choose a compression algorithm which we will
> > > > > > > > > > > > > > > provide by default and move the interface to the internals of the
> > > > > > > > > > > > > > > binary infrastructure.
> > > > > > > > > > > > > > > For this case I've prepared the benchmarks, which I sent earlier.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > I vote for the ZSTD algorithm [2]; it provides a good compression
> > > > > > > > > > > > > > > ratio and good throughput. It has implementations in Java, .NET and
> > > > > > > > > > > > > > > C++ and an ASF-friendly license, so we can use it on all Ignite
> > > > > > > > > > > > > > > platforms.
> > > > > > > > > > > > > > > You can look at an assessment of this algorithm in my benchmarks.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > [1] https://issues.apache.org/jira/browse/IGNITE-3592
> > > > > > > > > > > > > > > [2] https://github.com/facebook/zstd
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > 2017-06-06 16:02 GMT+03:00 Антон Чураев <
> > > > > > > [hidden email]
> > > > > > > > >:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Looks good to me.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Could you describe the implementation design in a couple of
> > > > > > > > > > > > > > > > sentences, so that we can estimate the completeness and
> > > > > > > > > > > > > > > > complexity of the proposal?
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > 2017-06-06 15:26 GMT+03:00 Vyacheslav Daradur <
> > > > > > > > > > [hidden email]
> > > > > > > > > > > >:
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Anton,
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Of course, the solution does not affect the existing
> > > > > > > > > > > > > > > > > implementation. I mean, nothing changes (including
> > > > > > > > > > > > > > > > > performance) if the user does not use the @BinaryCompression
> > > > > > > > > > > > > > > > > annotation.
> > > > > > > > > > > > > > > > > Only if the user decides to use compression on a specific
> > > > > > > > > > > > > > > > > field or fields of a class will compression be applied at
> > > > > > > > > > > > > > > > > marshalling, and only to the annotated fields.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > 2017-06-06 15:10 GMT+03:00 Антон Чураев <
> > > > > > > > > [hidden email]
> > > > > > > > > > >:
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Vyacheslav,
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > Is it possible to propose an implementation that can be
> > > > > > > > > > > > > > > > > > switched on on demand?
> > > > > > > > > > > > > > > > > > In this case it should not affect the performance of the
> > > > > > > > > > > > > > > > > > current solution.
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > I mean that users should decide what is more important for
> > > > > > > > > > > > > > > > > > them: throughput or memory/net usage.
> > > > > > > > > > > > > > > > > > Maybe they will choose to compress not all objects, or
> > > > > > > > > > > > > > > > > > only some attributes of objects.
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > 2017-06-06 14:48 GMT+03:00 Vyacheslav
> > Daradur <
> > > > > > > > > > > > [hidden email]
> > > > > > > > > > > > > >:
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > Conclusion:
> > > > > > > > > > > > > > > > > > > The provided solution allows reducing the size of an
> > > > > > > > > > > > > > > > > > > object in IgniteCache at the cost of reduced throughput
> > > > > > > > > > > > > > > > > > > (a small reduction in some cases); it depends on the part
> > > > > > > > > > > > > > > > > > > of the object which is compressed and on the compression
> > > > > > > > > > > > > > > > > > > algorithm.
> > > > > > > > > > > > > > > > > > > I mean, we can make more effective use of memory, and in
> > > > > > > > > > > > > > > > > > > some cases it can reduce the load on the interconnect
> > > > > > > > > > > > > > > > > > > (replication, rebalancing).
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > It will be particularly useful for object fields which
> > > > > > > > > > > > > > > > > > > hold large text (>~ 250 bytes) and can be compressed
> > > > > > > > > > > > > > > > > > > effectively.
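The size remark in the conclusion above suggests a simple guard before compressing a field. The sketch below is an assumption-laden illustration, not part of the proposal: the class name and the MIN_SIZE constant (taken from the approximate 250-byte figure) are hypothetical, and java.util.zip's Deflater stands in for whichever algorithm is finally chosen.

```java
import java.util.zip.Deflater;

/** Illustrative guard: compress a field only when it is likely to pay off. */
final class CompressionThresholdSketch {
    /** Assumed cut-off based on the ~250 byte figure; should be tuned from benchmarks. */
    static final int MIN_SIZE = 250;

    /** Returns the number of bytes the input occupies after deflating. */
    static int deflatedSize(byte[] input) {
        Deflater deflater = new Deflater();
        deflater.setInput(input);
        deflater.finish();
        byte[] buf = new byte[1024];
        int total = 0;
        while (!deflater.finished())
            total += deflater.deflate(buf);
        deflater.end();
        return total;
    }

    /** True if the field is large enough and actually shrinks when compressed. */
    static boolean worthCompressing(byte[] fieldBytes) {
        return fieldBytes.length >= MIN_SIZE && deflatedSize(fieldBytes) < fieldBytes.length;
    }
}
```

A check like this costs one trial compression per field, so in practice the result would be reused rather than compressing twice.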
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > 2017-06-06 12:00 GMT+03:00 Антон Чураев <
> > > > > > > > > > > [hidden email]
> > > > > > > > > > > > >:
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > Vyacheslav, thank you! But could you please provide
> > > > > > > > > > > > > > > > > > > > conclusions or proposals based on these benchmarks?
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > 2017-06-06 11:28 GMT+03:00 Vyacheslav
> > > > Daradur <
> > > > > > > > > > > > > > [hidden email]
> > > > > > > > > > > > > > > >:
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > Dmitry,
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > Excel pages:
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > 1) "Compression ratio (2)" - shows object size with and
> > > > > > > > > > > > > > > > > > > > > without compression. (Conditions: literal text)
> > > > > > > > > > > > > > > > > > > > > The 1st graph shows the compression ratios of different
> > > > > > > > > > > > > > > > > > > > > compression algorithms depending on the size of the
> > > > > > > > > > > > > > > > > > > > > compressed field.
> > > > > > > > > > > > > > > > > > > > > The 2nd graph shows the evaluated size of objects
> > > > > > > > > > > > > > > > > > > > > depending on sizes and compression algorithms.
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > 2) "Compression ratio (1)" - shows object size with and
> > > > > > > > > > > > > > > > > > > > > without compression. (Conditions: poorly compressible
> > > > > > > > > > > > > > > > > > > > > character sequence)
> > > > > > > > > > > > > > > > > > > > > The 1st graph shows the compression ratios of different
> > > > > > > > > > > > > > > > > > > > > compression algorithms depending on the size of the
> > > > > > > > > > > > > > > > > > > > > compressed field.
> > > > > > > > > > > > > > > > > > > > > The 2nd graph shows the evaluated size of objects
> > > > > > > > > > > > > > > > > > > > > depending on sizes and compression algorithms.
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > 3) "put-avg" - shows the average time of the "put"
> > > > > > > > > > > > > > > > > > > > > operation depending on size and compression algorithms.
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > 4) "put-thrpt" - shows the throughput of the "put"
> > > > > > > > > > > > > > > > > > > > > operation depending on size and compression algorithms.
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > 5) "get-avg" - shows the average time of the "get"
> > > > > > > > > > > > > > > > > > > > > operation depending on size and compression algorithms.
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > 6) "get-thrpt" - shows the throughput of the "get"
> > > > > > > > > > > > > > > > > > > > > operation depending on size and compression algorithms.
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > 2017-06-06 10:59 GMT+03:00 Dmitriy
> > > > Setrakyan
> > > > > <
> > > > > > > > > > > > > > > > [hidden email]
> > > > > > > > > > > > > > > > > >:
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > Vladimir, I am not sure how to interpret the graphs.
> > > > > > > > > > > > > > > > > > > > > > What are we looking at?
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > On Tue, Jun 6, 2017 at 12:33 AM,
> > > > Vyacheslav
> > > > > > > > > Daradur <
> > > > > > > > > > > > > > > > > > > [hidden email]
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > Hi, Igniters.
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > I've prepared some benchmarks. Results: [1].
> > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > I've also prepared an evaluation in the form of
> > > > > > > > > > > > > > > > > > > > > > > diagrams [2].
> > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > I hope that helps to interest the community and
> > > > > > > > > > > > > > > > > > > > > > > speeds up a reaction to this improvement :)
> > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > [1] https://github.com/daradurvs/ignite-compression/tree/master/src/main/resources/result
> > > > > > > > > > > > > > > > > > > > > > > [2] https://drive.google.com/file/d/0B2CeUAOgrHkoMklyZ25YTEdKcEk/view
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > 2017-05-24 9:49 GMT+03:00
> > Vyacheslav
> > > > > > Daradur
> > > > > > > <
> > > > > > > > > > > > > > > > > [hidden email]
> > > > > > > > > > > > > > > > > > >:
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > Guys, any thoughts?
> > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > 2017-05-16 13:40 GMT+03:00
> > > Vyacheslav
> > > > > > > > Daradur <
> > > > > > > > > > > > > > > > > > [hidden email]
> > > > > > > > > > > > > > > > > > > >:
> > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > >> Hi guys,
> > > > > > > > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > > > > > > > > >> I've prepared the PR to show my idea.
> > > > > > > > > > > > > > > > > > > > > > > >> https://github.com/apache/ignite/pull/1951/files
> > > > > > > > > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > > > > > > > > >> About querying: I've just copied the existing tests
> > > > > > > > > > > > > > > > > > > > > > > >> and annotated the test data.
> > > > > > > > > > > > > > > > > > > > > > > >> https://github.com/apache/ignite/pull/1951/files#diff-c19a9df4058141d059bb577e75244764
> > > > > > > > > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > > > > > > > > >> It means that fields marked with @BinaryCompression
> > > > > > > > > > > > > > > > > > > > > > > >> will be compressed at marshalling via
> > > > > > > > > > > > > > > > > > > > > > > >> BinaryMarshaller.
> > > > > > > > > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > > > > > > > > >> This solution has no effect on existing data or the
> > > > > > > > > > > > > > > > > > > > > > > >> project architecture.
> > > > > > > > > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > > > > > > > > >> I'll be glad to see your thoughts.
> > > > > > > > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > > > > > > > >> 2017-05-15 19:18 GMT+03:00
> > > > Vyacheslav
> > > > > > > > Daradur
> > > > > > > > > <
> > > > > > > > > > > > > > > > > > > [hidden email]
> > > > > > > > > > > > > > > > > > > > >:
> > > > > > > > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > > > > > > > >>> Dmitriy,
> > > > > > > > > > > > > > > > > > > > > > >>>
> > > > > > > > > > > > > > > > > > > > > > > >>> I have a prototype ready. I want to show it.
> > > > > > > > > > > > > > > > > > > > > > > >>> It is always easier to discuss with an example.
> > > > > > > > > > > > > > > > > > > > > > >>>
> > > > > > > > > > > > > > > > > > > > > > >>> 2017-05-15 19:02 GMT+03:00
> > > Dmitriy
> > > > > > > > Setrakyan
> > > > > > > > > <
> > > > > > > > > > > > > > > > > > > > [hidden email]
> > > > > > > > > > > > > > > > > > > > > >:
> > > > > > > > > > > > > > > > > > > > > > >>>
> > > > > > > > > > > > > > > > > > > > > > >>>> Vyacheslav,
> > > > > > > > > > > > > > > > > > > > > > >>>>
> > > > > > > > > > > > > > > > > > > > > > > >>>> I think it is a bit premature to provide a PR
> > > > > > > > > > > > > > > > > > > > > > > >>>> without getting a community consensus on the dev
> > > > > > > > > > > > > > > > > > > > > > > >>>> list. Please allow some time for the community to
> > > > > > > > > > > > > > > > > > > > > > > >>>> respond.
> > > > > > > > > > > > > > > > > > > > > > >>>>
> > > > > > > > > > > > > > > > > > > > > > >>>> D.
> > > > > > > > > > > > > > > > > > > > > > >>>>
> > > > > > > > > > > > > > > > > > > > > > >>>> On Mon, May 15, 2017 at 6:36
> > AM,
> > > > > > > > Vyacheslav
> > > > > > > > > > > > Daradur
> > > > > > > > > > > > > <
> > > > > > > > > > > > > > > > > > > > > > >>>> [hidden email]>
> > > > > > > > > > > > > > > > > > > > > > >>>> wrote:
> > > > > > > > > > > > > > > > > > > > > > >>>>
> > > > > > > > > > > > > > > > > > > > > > > >>>> > I created the ticket:
> > > > > > > > > > > > > > > > > > > > > > > >>>> > https://issues.apache.org/jira/browse/IGNITE-5226
> > > > > > > > > > > > > > > > > > > > > > > >>>> >
> > > > > > > > > > > > > > > > > > > > > > > >>>> > I'll prepare a PR with the described solution in a
> > > > > > > > > > > > > > > > > > > > > > > >>>> > couple of days.
> > > > > > > > > > > > > > > > > > > > > > >>>> >
> > > > > > > > > > > > > > > > > > > > > > >>>> > 2017-05-15 15:05 GMT+03:00
> > > > > > Vyacheslav
> > > > > > > > > > Daradur
> > > > > > > > > > > <
> > > > > > > > > > > > > > > > > > > > > [hidden email]
> > > > > > > > > > > > > > > > > > > > > > >:
> > > > > > > > > > > > > > > > > > > > > > >>>> >
> > > > > > > > > > > > > > > > > > > > > > >>>> > > Hi, Igniters!
> > > > > > > > > > > > > > > > > > > > > > >>>> > >
> > > > > > > > > > > > > > > > > > > > > > > >>>> > > Apache Ignite 2.0 is released.
> > > > > > > > > > > > > > > > > > > > > > > >>>> > >
> > > > > > > > > > > > > > > > > > > > > > > >>>> > > Let's continue the discussion about a compression
> > > > > > > > > > > > > > > > > > > > > > > >>>> > > design.
> > > > > > > > > > > > > > > > > > > > > > > >>>> > >
> > > > > > > > > > > > > > > > > > > > > > > >>>> > > At the moment, I have found only one solution
> > > > > > > > > > > > > > > > > > > > > > > >>>> > > which is compatible with querying and indexing:
> > > > > > > > > > > > > > > > > > > > > > > >>>> > > per-object-field compression.
> > > > > > > > > > > > > > > > > > > > > > > >>>> > > Per-field compression means that the metadata (a
> > > > > > > > > > > > > > > > > > > > > > > >>>> > > header) of an object won't be compressed; only
> > > > > > > > > > > > > > > > > > > > > > > >>>> > > the serialized values of the object's fields (in
> > > > > > > > > > > > > > > > > > > > > > > >>>> > > byte array form) will be compressed.
> > > > > > > > > > > > > > > > > > > > > > > >>>> > >
> > > > > > > > > > > > > > > > > > > > > > > >>>> > > This solution has some contentious issues:
> > > > > > > > > > > > > > > > > > > > > > > >>>> > > - there is no point in compressing small values
> > > > > > > > > > > > > > > > > > > > > > > >>>> > > such as primitives and short arrays;
> > > > > > > > > > > > > > > > > > > > > > > >>>> > > - it is not possible to use compression with
> > > > > > > > > > > > > > > > > > > > > > > >>>> > > Java predefined types.
> > > > > > > > > > > > > > > > > > > > > > > >>>> > >
> > > > > > > > > > > > > > > > > > > > > > > >>>> > > We can provide an annotation, @IgniteCompression
> > > > > > > > > > > > > > > > > > > > > > > >>>> > > for example, which users can use to mark the
> > > > > > > > > > > > > > > > > > > > > > > >>>> > > fields to compress.
> > > > > > > > > > > > > > > > > > > > > > > >>>> > >
> > > > > > > > > > > > > > > > > > > > > > > >>>> > > Any thoughts?
> > > > > > > > > > > > > > > > > > > > > > > >>>> > >
> > > > > > > > > > > > > > > > > > > > > > > >>>> > > Maybe someone already has a ready design?
> > > > > > > > > > > > > > > > > > > > > > >>>> > >
> > > > > > > > > > > > > > > > > > > > > > >>>> > > 2017-04-10 11:06
> GMT+03:00
> > > > > > > Vyacheslav
> > > > > > > > > > > Daradur
> > > > > > > > > > > > <
> > > > > > > > > > > > > > > > > > > > > > [hidden email]
> > > > > > > > > > > > > > > > > > > > > > >>>> >:
> > > > > > > > > > > > > > > > > > > > > > >>>> > >
> > > > > > > > > > > > > > > > > > > > > > >>>> > >> Alexey,
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>
> > > > > > > > > > > > > > > > > > > > > > >>>> > >> Yes, I've read it.
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>
> > > > > > > > > > > > > > > > > > > > > > > >>>> > >> OK, let's discuss the public API design.
> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>
> > > > > > > > > > > > > > > > > > > > > > > >>>> > >> I think we need to add a configuration entity to
> > > > > > > > > > > > > > > > > > > > > > > >>>> > >> CacheConfiguration, which will contain the
> > > > > > > > > > > > > > > > > > > > > > > >>>> > >> Compressor interface implementation and some
> > > > > > > > > > > > > > > > > > > > > > > >>>> > >> useful parameters.
> > > > > > > > > > > > > > > > > > > > > > > >>>> > >> Or maybe we could provide a BinaryMarshaller
> > > > > > > > > > > > > > > > > > > > > > > >>>> > >> decorator, which will compress data after
> > > > > > > > > > > > > > > > > > > > > > > >>>> > >> marshalling.
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>
> > > > > > > > > > > > > > > > > > > > > > >>>> > >> 2017-04-10 10:40
> > GMT+03:00
> > > > > Alexey
> > > > > > > > > > > Kuznetsov <
> > > > > > > > > > > > > > > > > > > > > > [hidden email]
> > > > > > > > > > > > > > > > > > > > > > >>>> >:
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> Vyacheslav,
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>>
> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> Did you read the initial discussion [1] about
> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> compression?
> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> As far as I remember, we agreed to add only some
> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> "top-level" API in order to provide a way for
> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> Ignite users to inject some sort of custom
> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> compression.
> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>>
> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> [1] http://apache-ignite-developers.2346864.n4.nabble.com/Data-compression-in-Ignite-2-0-td10099.html
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>>
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> On Mon, Apr 10, 2017
> at
> > > 2:19
> > > > > PM,
> > > > > > > > > > > daradurvs <
> > > > > > > > > > > > > > > > > > > > > [hidden email]
> > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > >>>> > wrote:
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>>
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > Hi Igniters!
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> >
> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > I am interested in this task:
> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > Provide some kind of pluggable compression SPI
> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > support
> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > <https://issues.apache.org/jira/browse/IGNITE-3592>
> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> >
> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > I developed a solution at the BinaryMarshaller
> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > level, but the reviewer rejected it.
> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> >
> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > Let's continue the discussion of the task goals
> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > and solution design.
> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > As I understand it, the main goal of this task
> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > is to store data in compressed form.
> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > This is what I need from Ignite as its user.
> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > Compression saves server resources: we can
> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > store more data on the same servers at the cost
> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > of increased CPU utilization.
> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> >
> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > I'm researching the possibility of implementing
> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > compression at the cache level.
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> >
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > Any thoughts?
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> >
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > --
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > Best regards,
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > Vyacheslav
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> >
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> >
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> >
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> >
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > --
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > View this message in
> > > > > context:
> > > > > > > > > > > > > > > > http://apache-ignite-
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> >
> > > > > developers.2346864.n4.nabble.
> > > > > > > > > > > > > > > > > com/Data-compression-in-
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> >
> > > > > Ignite-2-0-tp10099p16317.html
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > Sent from the Apache
> > > > Ignite
> > > > > > > > > Developers
> > > > > > > > > > > > > mailing
> > > > > > > > > > > > > > > > list
> > > > > > > > > > > > > > > > > > > > archive
> > > > > > > > > > > > > > > > > > > > > at
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> Nabble.com.
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> >



--
Sergey Kozlov
GridGain Systems
www.gridgain.com

Re: Data compression in Ignite 2.0

Sergey Kozlov
In reply to this post by Антон Чураев
Hi

* "Per-field compression" is applicable to huge BLOB fields, but it
imposes restrictions: such fields cannot be indexed, reads become slower,
and there is a risk of OOM if the compression ratio is too high.
Still, in some cases it makes sense.
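
For illustration, the application-side alternative mentioned in this thread (the user compresses a large field himself before putting it into the cache) could look roughly like the sketch below. It is only a sketch under assumptions: the class name FieldCompression and the maxSize limit are hypothetical and are not part of Ignite or of the proposed patch, and DEFLATE from java.util.zip merely stands in for whatever algorithm is finally chosen. The size limit on decompression is one possible guard against the OOM risk when the compression ratio is very high.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

/** Hypothetical helper (not an Ignite API): application-side compression of one large field. */
final class FieldCompression {
    private FieldCompression() {}

    /** Compresses raw field bytes with DEFLATE from java.util.zip. */
    static byte[] compress(byte[] raw) {
        Deflater deflater = new Deflater(Deflater.BEST_SPEED);
        try {
            deflater.setInput(raw);
            deflater.finish();
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[4096];
            while (!deflater.finished()) {
                int n = deflater.deflate(buf);
                out.write(buf, 0, n);
            }
            return out.toByteArray();
        } finally {
            deflater.end(); // release native zlib resources
        }
    }

    /** Decompresses a field, refusing to produce more than maxSize bytes (assumed OOM guard). */
    static byte[] decompress(byte[] packed, int maxSize) {
        Inflater inflater = new Inflater();
        try {
            inflater.setInput(packed);
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[4096];
            while (!inflater.finished()) {
                int n = inflater.inflate(buf);
                if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                    throw new DataFormatException("Truncated or unsupported compressed data");
                }
                if (out.size() + n > maxSize) {
                    throw new IllegalStateException("Decompressed field exceeds " + maxSize + " bytes");
                }
                out.write(buf, 0, n);
            }
            return out.toByteArray();
        } catch (DataFormatException e) {
            throw new IllegalArgumentException("Corrupt compressed field", e);
        } finally {
            inflater.end();
        }
    }

    public static void main(String[] args) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 1000; i++) {
            sb.append("Ignite stores this field as part of a BinaryObject. ");
        }
        byte[] raw = sb.toString().getBytes(StandardCharsets.UTF_8);
        byte[] packed = compress(raw);
        byte[] restored = decompress(packed, 1 << 20); // accept at most 1 MiB
        System.out.println("raw=" + raw.length + " packed=" + packed.length
            + " roundTrip=" + Arrays.equals(raw, restored));
    }
}
```

The resulting byte[] can be stored as an ordinary field of the value object. Such a field is opaque to Ignite, which is why it cannot be indexed or queried by content.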

On Fri, Jun 9, 2017 at 11:11 AM, Антон Чураев <[hidden email]> wrote:

> Seems that Dmitry is referring to transparent data encryption. It is used
> throughout the whole database industry.
>
> 2017-06-09 10:50 GMT+03:00 Vladimir Ozerov <[hidden email]>:
>
> > Dima,
> >
> > Encryption of certain fields is as bad as compression. First, it is a huge
> > change, which makes the already complex binary protocol even more complex.
> > Second, it has to be ported to the CPP and .NET platforms, as well as to
> > JDBC and ODBC.
> > Last, but most important: encrypting sensitive data is not our headache.
> > This is the user's responsibility. Nobody in their right mind will store
> > passwords in plain form. Instead, the user should encrypt them on his own,
> > choosing proper encryption parameters (algorithms, key lengths, salts,
> > etc.). How are you going to expose this in the API or configuration?
> >
> > We should not implement data encryption on the binary level; this is out of
> > the question. Encryption should be implemented on the application level
> > (user effort), the transport layer (SSL, which we already have), and
> > possibly on the disk level (there are tools for this already).
> >
> >
> > On Fri, Jun 9, 2017 at 9:06 AM, Vyacheslav Daradur <[hidden email]> wrote:
> >
> > > >> which is much less useful.
> > > Note that in some cases the gain is more than twofold in object size.
> > >
> > > >> Would it be possible to change your implementation to handle the
> > > >> encryption instead?
> > > Yes, of course; there's not much difference between compression and
> > > encryption, including in my implementation of per-field compression.
> > >
> > > 2017-06-09 8:55 GMT+03:00 Dmitriy Setrakyan <[hidden email]>:
> > >
> > > > Vyacheslav,
> > > >
> > > > When this feature started out as data compression in Ignite, it sounded
> > > > very useful. Now it is unfolding as a per-field compression, which is much
> > > > less useful. In fact, it is questionable whether it is useful at all. The
> > > > fact that this feature is implemented does not make it mandatory for the
> > > > community to accept it.
> > > >
> > > > However, as I mentioned before, per-field encryption is very useful, as it
> > > > would allow users to automatically encrypt certain sensitive fields, like
> > > > passwords, credit card numbers, etc. There is not much conceptual
> > > > difference between compressing a field vs encrypting a field. Would it be
> > > > possible to change your implementation to handle the encryption instead?
> > > >
> > > > D.
> > > >
> > > > On Thu, Jun 8, 2017 at 10:42 PM, Vyacheslav Daradur <[hidden email]> wrote:
> > > >
> > > > > Guys, I want to be clear:
> > > > > * The "per-field compression" design is the result of researching the
> > > > > binary infrastructure of Ignite and some of its other parts (querying,
> > > > > indexing, etc.).
> > > > > * Full compression of an object would be more effective, but in that case
> > > > > there is no compatibility with querying and indexing (or there is a large
> > > > > overhead from decompressing the full object (or cache pages) on demand).
> > > > > * "Per-field compression" is one of the ways to implement the compression
> > > > > feature.
> > > > >
> > > > > I'm new to Ignite, so I may be mistaken about some things.
> > > > > For the last 3-4 months I've tried to start a discussion about the design,
> > > > > but nobody answered (except Dmitry and Valentin, who were interested in how
> > > > > it works).
> > > > > But I understand that this is a community and nobody is obliged to anybody.
> > > > >
> > > > > There are strong Ignite experts here.
> > > > > If they can help me and the community with the design of the compression
> > > > > feature, it will be great.
> > > > > At the moment I have the desire and the time to work on the compression
> > > > > feature in Ignite.
> > > > > Let's use this opportunity :)
> > > > >
> > > > > 2017-06-09 5:36 GMT+03:00 Dmitriy Setrakyan <[hidden email]>:
> > > > >
> > > > > > Igniters,
> > > > > >
> > > > > > I have never seen a single Ignite user asking about compressing a single
> > > > > > field. However, we have had requests to secure certain fields, e.g.
> > > > > > passwords.
> > > > > >
> > > > > > I personally do not think per-field compression is needed, unless we can
> > > > > > point out some concrete real-life use cases.
> > > > > >
> > > > > > D.
> > > > > >
> > > > > > On Thu, Jun 8, 2017 at 3:42 AM, Vyacheslav Daradur <[hidden email]> wrote:
> > > > > >
> > > > > > > Anton,
> > > > > > >
> > > > > > > >> I thought that if compressed data is stored in memory, the data
> > > > > > > >> will be transmitted over the wire compressed too. Is that right?
> > > > > > >
> > > > > > > In the per-field compression case, yes.
> > > > > > >
> > > > > > > 2017-06-08 13:36 GMT+03:00 Антон Чураев <[hidden email]>:
> > > > > > >
> > > > > > > > Guys, could you please help me.
> > > > > > > > I thought that if compressed data is stored in memory, the data
> > > > > > > > will be transmitted over the wire compressed too. Is that right?
> > > > > > > >
> > > > > > > > 2017-06-08 13:30 GMT+03:00 Vyacheslav Daradur <[hidden email]>:
> > > > > > > >
> > > > > > > > > Vladimir,
> > > > > > > > >
> > > > > > > > > The main problem I am trying to solve is storing data in memory in
> > > > > > > > > a compressed form via Ignite.
> > > > > > > > > The main goal is to use memory more effectively.
> > > > > > > > >
> > > > > > > > > >> here the much simpler step would be to do full
> > > > > > > > > >> compression on a per-cache basis rather than dealing with the
> > > > > > > > > >> per-field case.
> > > > > > > > >
> > > > > > > > > Please explain your idea. Compress data by memory page?
> > > > > > > > > Is it compatible with querying and indexing?
> > > > > > > > >
> > > > > > > > > >> In the end, if a user would like to compress a particular field,
> > > > > > > > > >> he can always do it on his own
> > > > > > > > > I don't think we should reason this way: if a user needs something,
> > > > > > > > > he tries to choose a tool that has this feature OOTB.
> > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > 2017-06-08 12:53 GMT+03:00 Vladimir Ozerov <[hidden email]>:
> > > > > > > > >
> > > > > > > > > > Igniters,
> > > > > > > > > >
> > > > > > > > > > Honestly, I still do not see how to apply this feature gracefully
> > > > > > > > > > to Ignite. And the overall approach of compressing only particular
> > > > > > > > > > fields looks overcomplicated to me. Remember that our main use case
> > > > > > > > > > is an application without classes on the server. It means that any
> > > > > > > > > > kind of annotation is inapplicable. To be more precise: a proper API
> > > > > > > > > > should be implemented to handle the no-class case (e.g. how would
> > > > > > > > > > one build such an object through BinaryBuilder without a class?),
> > > > > > > > > > and only then should annotations be added as a convenient addition
> > > > > > > > > > to the more basic API.
> > > > > > > > > >
> > > > > > > > > > It seems to me that a full implementation, which takes into account
> > > > > > > > > > a proper "classless" API, changes to binary metadata to reflect
> > > > > > > > > > compressed fields, changes to SQL, changes to the binary protocol,
> > > > > > > > > > and porting to .NET and CPP, will yield a very complex solution with
> > > > > > > > > > little value to the product.
> > > > > > > > > >
> > > > > > > > > > Instead, as I proposed earlier, it seems that we'd better start with
> > > > > > > > > > the problem we are trying to solve. Basically, compression could
> > > > > > > > > > help in two cases:
> > > > > > > > > > 1) Transmitting data over the wire - it should be implemented on the
> > > > > > > > > > communication layer and should not affect the binary serialization
> > > > > > > > > > component a lot.
> > > > > > > > > > 2) Storing data in memory - here the much simpler step would be to
> > > > > > > > > > do full compression on a per-cache basis rather than dealing with
> > > > > > > > > > the per-field case.
> > > > > > > > > >
> > > > > > > > > > In the end, if a user would like to compress a particular field, he
> > > > > > > > > > can always do it on his own and set the already compressed field on
> > > > > > > > > > our BinaryObject.
> > > > > > > > > >
> > > > > > > > > > Vladimir.
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > On Thu, Jun 8, 2017 at 12:37 PM, Vyacheslav Daradur <[hidden email]> wrote:
> > > > > > > > > >
> > > > > > > > > > > Valentin,
> > > > > > > > > > >
> > > > > > > > > > > Yes, I have the prototype[1][2]
> > > > > > > > > > >
> > > > > > > > > > > You can see an example of a Java class[3] that I used in my benchmark.
> > > > > > > > > > > For example:
> > > > > > > > > > > class Foo {
> > > > > > > > > > >     @BinaryCompression
> > > > > > > > > > >     String data;
> > > > > > > > > > > }
> > > > > > > > > > > If the user decides to store the object in compressed form, he can
> > > > > > > > > > > use the annotation @BinaryCompression as shown above.
> > > > > > > > > > > It means the annotated field 'data' will be compressed at marshalling.
> > > > > > > > > > >
> > > > > > > > > > > [1] https://github.com/apache/ignite/pull/1951
> > > > > > > > > > > [2] https://issues.apache.org/jira/browse/IGNITE-5226
> > > > > > > > > > > [3] https://github.com/daradurvs/ignite-compression/blob/master/src/main/java/ru/daradurvs/ignite/compression/model/Audit1F.java
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > 2017-06-08 2:04 GMT+03:00 Valentin Kulichenko <[hidden email]>:
> > > > > > > > > > >
> > > > > > > > > > > > Vyacheslav, Anton,
> > > > > > > > > > > >
> > > > > > > > > > > > Are there any ideas and/or prototypes for the API? Your design
> > > > > > > > > > > > suggestions seem to make sense, but I would like to see how all
> > > > > > > > > > > > this will look from the user's standpoint.
> > > > > > > > > > > >
> > > > > > > > > > > > -Val
> > > > > > > > > > > >
> > > > > > > > > > > > On Wed, Jun 7, 2017 at 1:06 AM, Антон Чураев <[hidden email]> wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > > Vyacheslav, correct me if something is wrong.
> > > > > > > > > > > > >
> > > > > > > > > > > > > We could give users the opportunity to choose between CPU usage
> > > > > > > > > > > > > and MEM/NET usage by compressing some attributes of stored objects.
> > > > > > > > > > > > > You have studied the design, and it is possible to localize the
> > > > > > > > > > > > > changes in marshalling without affecting performance or current
> > > > > > > > > > > > > functionality.
> > > > > > > > > > > > >
> > > > > > > > > > > > > I think that it would be useful for our project and users.
> > > > > > > > > > > > > Community, what do you think about this proposal?
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > 2017-06-06 17:29 GMT+03:00 Vyacheslav Daradur <[hidden email]>:
> > > > > > > > > > > > >
> > > > > > > > > > > > > > In short,
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > During marshalling, a field is represented as a BinaryFieldAccessor,
> > > > > > > > > > > > > > which manages its marshalling. It checks whether the field is
> > > > > > > > > > > > > > marked with the @BinaryCompression annotation; in that case the
> > > > > > > > > > > > > > binary representation of the field (byte array) will be compressed.
> > > > > > > > > > > > > > It will be marked as compressed by a type constant
> > > > > > > > > > > > > > (GridBinaryMarshaller.COMPRESSED); after this the compressed byte
> > > > > > > > > > > > > > array will be included in the binary representation of the whole
> > > > > > > > > > > > > > object. Note that the header of the marshalled object will not be
> > > > > > > > > > > > > > compressed. Compression affects only the object's field
> > > > > > > > > > > > > > representation.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Objects in IgniteCache are represented as BinaryObject, which is a
> > > > > > > > > > > > > > wrapper over the byte array of the marshalled object.
> > > > > > > > > > > > > > BinaryObject provides some useful methods, which are used by Ignite
> > > > > > > > > > > > > > systems.
> > > > > > > > > > > > > > For example, the Queries use the BinaryObject#field method, which
> > > > > > > > > > > > > > deserializes only a field of the object, without deserializing the
> > > > > > > > > > > > > > whole object.
> > > > > > > > > > > > > > During deserialization, if the BinaryObject#field method meets the
> > > > > > > > > > > > > > constant of the compressed type, it decompresses this byte array,
> > > > > > > > > > > > > > then continues unmarshalling as usual.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Now, I introduced the Compressor interface in IgniteConfigurations;
> > > > > > > > > > > > > > it allows the user to use his own implementation of a compressor,
> > > > > > > > > > > > > > which is the requirement in the task[1].
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > As far as I know, Vladimir Ozerov doesn't like the idea of granting
> > > > > > > > > > > > > > this opportunity to the user.
> > > > > > > > > > > > > > In that case we can choose a compression algorithm which we will
> > > > > > > > > > > > > > provide by default and move the interface to the internals of the
> > > > > > > > > > > > > > binary infrastructure.
> > > > > > > > > > > > > > For this case I've prepared the benchmarks, which I sent earlier.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > I vote for the ZSTD algorithm[2]: it provides a good compression
> > > > > > > > > > > > > > ratio and good throughput. It has implementations in Java, .NET and
> > > > > > > > > > > > > > C++, and has an ASF-friendly license, so we can use it on all
> > > > > > > > > > > > > > Ignite platforms.
> > > > > > > > > > > > > > You can look at an assessment of this algorithm in my benchmarks.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > [1] https://issues.apache.org/jira/browse/IGNITE-3592
> > > > > > > > > > > > > > [2] https://github.com/facebook/zstd
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > 2017-06-06 16:02 GMT+03:00 Антон Чураев <[hidden email]>:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Looks good to me.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Could you describe the design of the implementation in a couple
> > > > > > > > > > > > > > > of sentences, so that we can estimate the completeness and
> > > > > > > > > > > > > > > complexity of the proposal?
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > 2017-06-06 15:26 GMT+03:00 Vyacheslav Daradur <[hidden email]>:
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Anton,
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Of course, the solution does not affect the existing
> > > > > > > > > > > > > > > > implementation. I mean, nothing changes if the user does not
> > > > > > > > > > > > > > > > use the @BinaryCompression annotation (no performance changes).
> > > > > > > > > > > > > > > > Only if the user decides to use compression on a specific field
> > > > > > > > > > > > > > > > or fields of a class will compression be used at marshalling
> > > > > > > > > > > > > > > > for the annotated fields.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > 2017-06-06 15:10 GMT+03:00 Антон Чураев <[hidden email]>:
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Vyacheslav,
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Is it possible to propose an implementation that can be
> > > > > > > > > > > > > > > > > switched on on demand?
> > > > > > > > > > > > > > > > > In that case it should not affect the performance of the
> > > > > > > > > > > > > > > > > current solution.
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > I mean that users should decide what is more important for
> > > > > > > > > > > > > > > > > them: throughput or memory/net usage.
> > > > > > > > > > > > > > > > > Maybe they will choose to compress not all objects, or only
> > > > > > > > > > > > > > > > > some attributes of objects.
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > 2017-06-06 14:48 GMT+03:00 Vyacheslav Daradur <[hidden email]>:
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > Conclusion:
> > > > > > > > > > > > > > > > > > The provided solution allows reducing the size of an object
> > > > > > > > > > > > > > > > > > in IgniteCache at the cost of a throughput reduction (small
> > > > > > > > > > > > > > > > > > in some cases); it depends on the part of the object that is
> > > > > > > > > > > > > > > > > > compressed and on the compression algorithm.
> > > > > > > > > > > > > > > > > > I mean, we can make more effective use of memory, and in some
> > > > > > > > > > > > > > > > > > cases it can reduce the load on the interconnect
> > > > > > > > > > > > > > > > > > (replication, rebalancing).
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > It will be particularly useful for object fields which are
> > > > > > > > > > > > > > > > > > large text (>~ 250 bytes) and can be effectively compressed.
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > 2017-06-06 12:00 GMT+03:00 Антон Чураев <[hidden email]>:
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > Vyacheslav, thank you! But could you please provide
> > > > > > > > > > > > > > > > > > > conclusions or proposals based on these benchmarks?
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > 2017-06-06 11:28 GMT+03:00 Vyacheslav Daradur <[hidden email]>:
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > Dmitry,
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > Excel-pages:
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > 1) "Compression ratio (2)" - shows object size, with
> > > > > > > > > > > > > > > > > > > > compression and without compression. (Conditions: literal text)
> > > > > > > > > > > > > > > > > > > > 1st graph shows compression ratios of different compression
> > > > > > > > > > > > > > > > > > > > algorithms depending on the size of the compressed field.
> > > > > > > > > > > > > > > > > > > > 2nd graph shows evaluation of size of objects depending on
> > > > > > > > > > > > > > > > > > > > sizes and compression algorithms.
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > 2) "Compression ratio (1)" - shows object size, with
> > > > > > > > > > > > > > > > > > > > compression and without compression. (Conditions: badly
> > > > > > > > > > > > > > > > > > > > compressed character sequence)
> > > > > > > > > > > > > > > > > > > > 1st graph shows compression ratios of different compression
> > > > > > > > > > > > > > > > > > > > algorithms depending on the size of the compressed field.
> > > > > > > > > > > > > > > > > > > > 2nd graph shows evaluation of size of objects depending on
> > > > > > > > > > > > > > > > > > > > sizes and compression algorithms.
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > 3) "put-avg" - shows average time of the "put" operation
> > > > > > > > > > > > > > > > > > > > depending on size and compression algorithms.
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > 4) "put-thrpt" - shows throughput of the "put" operation
> > > > > > > > > > > > > > > > > > > > depending on size and compression algorithms.
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > 5) "get-avg" - shows average time of the "get" operation
> > > > > > > > > > > > > > > > > > > > depending on size and compression algorithms.
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > 6) "get-thrpt" - shows throughput of the "get" operation
> > > > > > > > > > > > > > > > > > > > depending on size and compression algorithms.
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > 2017-06-06 10:59 GMT+03:00 Dmitriy
> > > > Setrakyan
> > > > > <
> > > > > > > > > > > > > > > > [hidden email]
> > > > > > > > > > > > > > > > > >:
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > Vladimir, I am not sure how to
> > > interpret
> > > > > the
> > > > > > > > > graphs?
> > > > > > > > > > > What
> > > > > > > > > > > > > are
> > > > > > > > > > > > > > > we
> > > > > > > > > > > > > > > > > > > looking
> > > > > > > > > > > > > > > > > > > > > at?
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > On Tue, Jun 6, 2017 at 12:33 AM,
> > > > Vyacheslav
> > > > > > > > > Daradur <
> > > > > > > > > > > > > > > > > > > [hidden email]
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > Hi, Igniters.
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > I've prepared some benchmarking.
> > > > Results
> > > > > > [1].
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > And I've prepared the evaluation
> in
> > > the
> > > > > > form
> > > > > > > of
> > > > > > > > > > > > diagrams
> > > > > > > > > > > > > > [2].
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > I hope that helps to interest the
> > > > > community
> > > > > > > and
> > > > > > > > > > > > > > accelerates a
> > > > > > > > > > > > > > > > > > > reaction
> > > > > > > > > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > > > > > > > this improvment :)
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > [1]
> > > > > > > > > > > > > > > > > > > > > > https://github.com/daradurvs/
> > > > > > > > > > > ignite-compression/tree/
> > > > > > > > > > > > > > > > > > > > > > master/src/main/resources/result
> > > > > > > > > > > > > > > > > > > > > > [2]
> https://drive.google.com/file/
> > d/
> > > > > > > > > > > > > > > > > 0B2CeUAOgrHkoMklyZ25YTEdKcEk/
> > > > > > > > > > > > > > > > > > > view
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > 2017-05-24 9:49 GMT+03:00
> > Vyacheslav
> > > > > > Daradur
> > > > > > > <
> > > > > > > > > > > > > > > > > [hidden email]
> > > > > > > > > > > > > > > > > > >:
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > Guys, any thoughts?
> > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > 2017-05-16 13:40 GMT+03:00
> > > Vyacheslav
> > > > > > > > Daradur <
> > > > > > > > > > > > > > > > > > [hidden email]
> > > > > > > > > > > > > > > > > > > >:
> > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > >> Hi guys,
> > > > > > > > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > > > > > > > >> I've prepared the PR to show
> my
> > > > idea.
> > > > > > > > > > > > > > > > > > > > > > >> https://github.com/apache/
> > > > > > > > > > ignite/pull/1951/files
> > > > > > > > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > > > > > > > >> About querying - I've just
> > copied
> > > > > > existing
> > > > > > > > > tests
> > > > > > > > > > > and
> > > > > > > > > > > > > > have
> > > > > > > > > > > > > > > > > > > annotated
> > > > > > > > > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > > > > > >> testing data.
> > > > > > > > > > > > > > > > > > > > > > >> https://github.com/apache/
> > > > > > > > > > > > > ignite/pull/1951/files#diff-
> > > > > > > > > > > > > > > > c19a9d
> > > > > > > > > > > > > > > > > > > > > > >> f4058141d059bb577e75244764
> > > > > > > > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > > > > > > > >> It means fields which will be
> > > marked
> > > > > by
> > > > > > > > > > > > > > @BinaryCompression
> > > > > > > > > > > > > > > > > will
> > > > > > > > > > > > > > > > > > be
> > > > > > > > > > > > > > > > > > > > > > >> compressed at marshalling via
> > > > > > > > > BinaryMarshaller.
> > > > > > > > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > > > > > > > >> This solution has no effect on
> > > > > existing
> > > > > > > data
> > > > > > > > > or
> > > > > > > > > > > > > project
> > > > > > > > > > > > > > > > > > > > architecture.
> > > > > > > > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > > > > > > > >> I'll be glad to see your
> > thougths.
> > > > > > > > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > > > > > > > >> 2017-05-15 19:18 GMT+03:00
> > > > Vyacheslav
> > > > > > > > Daradur
> > > > > > > > > <
> > > > > > > > > > > > > > > > > > > [hidden email]
> > > > > > > > > > > > > > > > > > > > >:
> > > > > > > > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > > > > > > > >>> Dmitriy,
> > > > > > > > > > > > > > > > > > > > > > >>>
> > > > > > > > > > > > > > > > > > > > > > >>> I have ready prototype. I
> want
> > to
> > > > > show
> > > > > > > it.
> > > > > > > > > > > > > > > > > > > > > > >>> It is always easier to
> discuss
> > on
> > > > > > > example.
> > > > > > > > > > > > > > > > > > > > > > >>>
> > > > > > > > > > > > > > > > > > > > > > >>> 2017-05-15 19:02 GMT+03:00
> > > Dmitriy
> > > > > > > > Setrakyan
> > > > > > > > > <
> > > > > > > > > > > > > > > > > > > > [hidden email]
> > > > > > > > > > > > > > > > > > > > > >:
> > > > > > > > > > > > > > > > > > > > > > >>>
> > > > > > > > > > > > > > > > > > > > > > >>>> Vyacheslav,
> > > > > > > > > > > > > > > > > > > > > > >>>>
> > > > > > > > > > > > > > > > > > > > > > >>>> I think it is a bit
> premature
> > to
> > > > > > > provide a
> > > > > > > > > PR
> > > > > > > > > > > > > without
> > > > > > > > > > > > > > > > > getting
> > > > > > > > > > > > > > > > > > a
> > > > > > > > > > > > > > > > > > > > > > >>>> community
> > > > > > > > > > > > > > > > > > > > > > >>>> consensus on the dev list.
> > > Please
> > > > > > allow
> > > > > > > > some
> > > > > > > > > > > time
> > > > > > > > > > > > > for
> > > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > > > community
> > > > > > > > > > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > > > > > > > >>>> respond.
> > > > > > > > > > > > > > > > > > > > > > >>>>
> > > > > > > > > > > > > > > > > > > > > > >>>> D.
> > > > > > > > > > > > > > > > > > > > > > >>>>
> > > > > > > > > > > > > > > > > > > > > > >>>> On Mon, May 15, 2017 at 6:36
> > AM,
> > > > > > > > Vyacheslav
> > > > > > > > > > > > Daradur
> > > > > > > > > > > > > <
> > > > > > > > > > > > > > > > > > > > > > >>>> [hidden email]>
> > > > > > > > > > > > > > > > > > > > > > >>>> wrote:
> > > > > > > > > > > > > > > > > > > > > > >>>>
> > > > > > > > > > > > > > > > > > > > > > >>>> > I created the ticket:
> > > > > > > > > > > > > > https://issues.apache.org/jira
> > > > > > > > > > > > > > > > > > > > > > >>>> /browse/IGNITE-5226
> > > > > > > > > > > > > > > > > > > > > > >>>> >
> > > > > > > > > > > > > > > > > > > > > > >>>> > I'll prepare a PR with
> > > described
> > > > > > > > solution
> > > > > > > > > in
> > > > > > > > > > > > > couple
> > > > > > > > > > > > > > of
> > > > > > > > > > > > > > > > > days.
> > > > > > > > > > > > > > > > > > > > > > >>>> >
> > > > > > > > > > > > > > > > > > > > > > >>>> > 2017-05-15 15:05 GMT+03:00
> > > > > > Vyacheslav
> > > > > > > > > > Daradur
> > > > > > > > > > > <
> > > > > > > > > > > > > > > > > > > > > [hidden email]
> > > > > > > > > > > > > > > > > > > > > > >:
> > > > > > > > > > > > > > > > > > > > > > >>>> >
> > > > > > > > > > > > > > > > > > > > > > >>>> > > Hi, Igniters!
> > > > > > > > > > > > > > > > > > > > > > >>>> > >
> > > > > > > > > > > > > > > > > > > > > > >>>> > > Apache 2.0 is released.
> > > > > > > > > > > > > > > > > > > > > > >>>> > >
> > > > > > > > > > > > > > > > > > > > > > >>>> > > Let's continue the
> > > discussion
> > > > > > about
> > > > > > > a
> > > > > > > > > > > > > compression
> > > > > > > > > > > > > > > > > design.
> > > > > > > > > > > > > > > > > > > > > > >>>> > >
> > > > > > > > > > > > > > > > > > > > > > >>>> > > At the moment, I found
> > only
> > > > one
> > > > > > > > solution
> > > > > > > > > > > which
> > > > > > > > > > > > > is
> > > > > > > > > > > > > > > > > > compatible
> > > > > > > > > > > > > > > > > > > > > with
> > > > > > > > > > > > > > > > > > > > > > >>>> > querying
> > > > > > > > > > > > > > > > > > > > > > >>>> > > and indexing, this is
> > > > > > > > per-objects-field
> > > > > > > > > > > > > > compression.
> > > > > > > > > > > > > > > > > > > > > > >>>> > > Per-fields compression
> > means
> > > > > that
> > > > > > > > > metadata
> > > > > > > > > > > (a
> > > > > > > > > > > > > > > header)
> > > > > > > > > > > > > > > > of
> > > > > > > > > > > > > > > > > > an
> > > > > > > > > > > > > > > > > > > > > object
> > > > > > > > > > > > > > > > > > > > > > >>>> won't
> > > > > > > > > > > > > > > > > > > > > > >>>> > > be compressed, only
> > > serialized
> > > > > > > values
> > > > > > > > of
> > > > > > > > > > an
> > > > > > > > > > > > > object
> > > > > > > > > > > > > > > > > fields
> > > > > > > > > > > > > > > > > > > (in
> > > > > > > > > > > > > > > > > > > > > > bytes
> > > > > > > > > > > > > > > > > > > > > > >>>> array
> > > > > > > > > > > > > > > > > > > > > > >>>> > > form) will be
> compressed.
> > > > > > > > > > > > > > > > > > > > > > >>>> > >
> > > > > > > > > > > > > > > > > > > > > > >>>> > > This solution have some
> > > > > > contentious
> > > > > > > > > > issues:
> > > > > > > > > > > > > > > > > > > > > > >>>> > > - small values, like
> > > > primitives
> > > > > > and
> > > > > > > > > short
> > > > > > > > > > > > > arrays -
> > > > > > > > > > > > > > > > there
> > > > > > > > > > > > > > > > > > > isn't
> > > > > > > > > > > > > > > > > > > > > > >>>> sense to
> > > > > > > > > > > > > > > > > > > > > > >>>> > > compress them;
> > > > > > > > > > > > > > > > > > > > > > >>>> > > - there is no possible
> to
> > > use
> > > > > > > > > compression
> > > > > > > > > > > with
> > > > > > > > > > > > > > > > > > > java-predefined
> > > > > > > > > > > > > > > > > > > > > > >>>> types;
> > > > > > > > > > > > > > > > > > > > > > >>>> > >
> > > > > > > > > > > > > > > > > > > > > > >>>> > > We can provide an
> > > annotation,
> > > > > > > > > > > > > @IgniteCompression -
> > > > > > > > > > > > > > > for
> > > > > > > > > > > > > > > > > > > > example,
> > > > > > > > > > > > > > > > > > > > > > >>>> which can
> > > > > > > > > > > > > > > > > > > > > > >>>> > > be used by users for
> > marking
> > > > > > fields
> > > > > > > to
> > > > > > > > > > > > compress.
> > > > > > > > > > > > > > > > > > > > > > >>>> > >
> > > > > > > > > > > > > > > > > > > > > > >>>> > > Any thoughts?
> > > > > > > > > > > > > > > > > > > > > > >>>> > >
> > > > > > > > > > > > > > > > > > > > > > >>>> > > Maybe someone already
> have
> > > > ready
> > > > > > > > design?
> > > > > > > > > > > > > > > > > > > > > > >>>> > >
> > > > > > > > > > > > > > > > > > > > > > >>>> > > 2017-04-10 11:06
> GMT+03:00
> > > > > > > Vyacheslav
> > > > > > > > > > > Daradur
> > > > > > > > > > > > <
> > > > > > > > > > > > > > > > > > > > > > [hidden email]
> > > > > > > > > > > > > > > > > > > > > > >>>> >:
> > > > > > > > > > > > > > > > > > > > > > >>>> > >
> > > > > > > > > > > > > > > > > > > > > > >>>> > >> Alexey,
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>
> > > > > > > > > > > > > > > > > > > > > > >>>> > >> Yes, I've read it.
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>
> > > > > > > > > > > > > > > > > > > > > > >>>> > >> Ok, let's discuss about
> > > > public
> > > > > > API
> > > > > > > > > > design.
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>
> > > > > > > > > > > > > > > > > > > > > > >>>> > >> I think we need to add
> > > some a
> > > > > > > > configure
> > > > > > > > > > > > entity
> > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > > > > > > > >>>> CacheConfiguration,
> > > > > > > > > > > > > > > > > > > > > > >>>> > >> which will contain the
> > > > > Compressor
> > > > > > > > > > interface
> > > > > > > > > > > > > > > > > > implementation
> > > > > > > > > > > > > > > > > > > > and
> > > > > > > > > > > > > > > > > > > > > > some
> > > > > > > > > > > > > > > > > > > > > > >>>> > usefull
> > > > > > > > > > > > > > > > > > > > > > >>>> > >> parameters.
> > > > > > > > > > > > > > > > > > > > > > >>>> > >> Or maybe to provide a
> > > > > > > > BinaryMarshaller
> > > > > > > > > > > > > decorator,
> > > > > > > > > > > > > > > > which
> > > > > > > > > > > > > > > > > > > will
> > > > > > > > > > > > > > > > > > > > be
> > > > > > > > > > > > > > > > > > > > > > >>>> compress
> > > > > > > > > > > > > > > > > > > > > > >>>> > >> data after marshalling.
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>
> > > > > > > > > > > > > > > > > > > > > > >>>> > >> 2017-04-10 10:40
> > GMT+03:00
> > > > > Alexey
> > > > > > > > > > > Kuznetsov <
> > > > > > > > > > > > > > > > > > > > > > [hidden email]
> > > > > > > > > > > > > > > > > > > > > > >>>> >:
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> Vyacheslav,
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>>
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> Did you read initial
> > > > > discussion
> > > > > > > [1]
> > > > > > > > > > about
> > > > > > > > > > > > > > > > compression?
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> As far as I remember
> we
> > > > agreed
> > > > > > to
> > > > > > > > add
> > > > > > > > > > only
> > > > > > > > > > > > > some
> > > > > > > > > > > > > > > > > > > "top-level"
> > > > > > > > > > > > > > > > > > > > > API
> > > > > > > > > > > > > > > > > > > > > > in
> > > > > > > > > > > > > > > > > > > > > > >>>> > order
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> to
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> provide a way for
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> Ignite users to inject
> > > some
> > > > > sort
> > > > > > > of
> > > > > > > > > > custom
> > > > > > > > > > > > > > > > > compression.
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>>
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>>
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> [1]
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>>
> > > > > http://apache-ignite-developer
> > > > > > > > > > > > > > s.2346864.n4.nabble
> > > > > > > > > > > > > > > .
> > > > > > > > > > > > > > > > > > > > com/Data-c
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>>
> > ompression-in-Ignite-2-0-
> > > > > > > > td10099.html
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>>
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> On Mon, Apr 10, 2017
> at
> > > 2:19
> > > > > PM,
> > > > > > > > > > > daradurvs <
> > > > > > > > > > > > > > > > > > > > > [hidden email]
> > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > >>>> > wrote:
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>>
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > Hi Igniters!
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> >
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > I am interested in
> > this
> > > > > task.
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > Provide some kind of
> > > > > pluggable
> > > > > > > > > > > compression
> > > > > > > > > > > > > SPI
> > > > > > > > > > > > > > > > > support
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > <
> > > > https://issues.apache.org/
> > > > > > > > > > > > > > > > jira/browse/IGNITE-3592>
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> >
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > I developed a
> solution
> > > on
> > > > > > > > > > > > > > > BinaryMarshaller-level,
> > > > > > > > > > > > > > > > > but
> > > > > > > > > > > > > > > > > > > > > reviewer
> > > > > > > > > > > > > > > > > > > > > > >>>> has
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> rejected
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > it.
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> >
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > Let's continue
> > > discussion
> > > > of
> > > > > > > task
> > > > > > > > > > goals
> > > > > > > > > > > > and
> > > > > > > > > > > > > > > > solution
> > > > > > > > > > > > > > > > > > > > design.
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > As I understood
> that,
> > > the
> > > > > main
> > > > > > > > goal
> > > > > > > > > of
> > > > > > > > > > > > this
> > > > > > > > > > > > > > task
> > > > > > > > > > > > > > > > is
> > > > > > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > > > > > store
> > > > > > > > > > > > > > > > > > > > > > >>>> data in
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > compressed form.
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > This is what I need
> > from
> > > > > > Ignite
> > > > > > > as
> > > > > > > > > its
> > > > > > > > > > > > user.
> > > > > > > > > > > > > > > > > > Compression
> > > > > > > > > > > > > > > > > > > > > > >>>> provides
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> economy
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > on
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > servers.
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > We can store more
> data
> > > on
> > > > > same
> > > > > > > > > servers
> > > > > > > > > > > at
> > > > > > > > > > > > > the
> > > > > > > > > > > > > > > cost
> > > > > > > > > > > > > > > > > of
> > > > > > > > > > > > > > > > > > > > > > >>>> increasing CPU
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > utilization.
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> >
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > I'm researching a
> > > > > possibility
> > > > > > of
> > > > > > > > > > > > > > implementation
> > > > > > > > > > > > > > > of
> > > > > > > > > > > > > > > > > > > > > compression
> > > > > > > > > > > > > > > > > > > > > > >>>> at the
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > cache-level.
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> >
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > Any thoughts?
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> >
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > --
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > Best regards,
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > Vyacheslav
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> >
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> >
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> >
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> >
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > --
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > View this message in
> > > > > context:
> > > > > > > > > > > > > > > > http://apache-ignite-
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> >
> > > > > developers.2346864.n4.nabble.
> > > > > > > > > > > > > > > > > com/Data-compression-in-
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> >
> > > > > Ignite-2-0-tp10099p16317.html
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > Sent from the Apache
> > > > Ignite
> > > > > > > > > Developers
> > > > > > > > > > > > > mailing
> > > > > > > > > > > > > > > > list
> > > > > > > > > > > > > > > > > > > > archive
> > > > > > > > > > > > > > > > > > > > > at
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> Nabble.com.
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> >
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>>
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>>
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>>
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> --
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>> Alexey Kuznetsov
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>>
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>
> > > > > > > > > > > > > > > > > > > > > > >>>> > >> --
> > > > > > > > > > > > > > > > > > > > > > >>>> > >> Best Regards,
> Vyacheslav
> > > > > > > > > > > > > > > > > > > > > > >>>> > >>
> > > > > > > > > > > > > > > > > > > > > > >>>> > >
> > > > > > > > > > > > > > > > > > > > > > >>>> > >
> > > > > > > > > > > > > > > > > > > > > > >>>> > >
> > > > > > > > > > > > > > > > > > > > > > >>>> > > --
> > > > > > > > > > > > > > > > > > > > > > >>>> > > Best Regards, Vyacheslav
> > > > > > > > > > > > > > > > > > > > > > >>>> > >
> > > > > > > > > > > > > > > > > > > > > > >>>> >
> > > > > > > > > > > > > > > > > > > > > > >>>> >
> > > > > > > > > > > > > > > > > > > > > > >>>> >
> > > > > > > > > > > > > > > > > > > > > > >>>> > --
> > > > > > > > > > > > > > > > > > > > > > >>>> > Best Regards, Vyacheslav
> > > > > > > > > > > > > > > > > > > > > > >>>> >
> > > > > > > > > > > > > > > > > > > > > > >>>>
> > > > > > > > > > > > > > > > > > > > > > >>>
> > > > > > > > > > > > > > > > > > > > > > >>>
> > > > > > > > > > > > > > > > > > > > > > >>>
> > > > > > > > > > > > > > > > > > > > > > >>> --
> > > > > > > > > > > > > > > > > > > > > > >>> Best Regards, Vyacheslav
> > > > > > > > > > > > > > > > > > > > > > >>>
> > > > > > > > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > > > > > > > >> --
> > > > > > > > > > > > > > > > > > > > > > >> Best Regards, Vyacheslav
> > > > > > > > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > --
> > > > > > > > > > > > > > > > > > > > > > > Best Regards, Vyacheslav
> > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > --
> > > > > > > > > > > > > > > > > > > > > > Best Regards, Vyacheslav
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > --
> > > > > > > > > > > > > > > > > > > > Best Regards, Vyacheslav
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > --
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > Best Regards, Anton Churaev
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > --
> > > > > > > > > > > > > > > > > > Best Regards, Vyacheslav
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > --
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Best Regards, Anton Churaev
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > --
> > > > > > > > > > > > > > > > Best Regards, Vyacheslav
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > --
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Best Regards, Anton Churaev
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > --
> > > > > > > > > > > > > > Best Regards, Vyacheslav
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > --
> > > > > > > > > > > > >
> > > > > > > > > > > > > Best Regards, Anton Churaev
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > --
> > > > > > > > > > > Best Regards, Vyacheslav
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > --
> > > > > > > > > Best Regards, Vyacheslav
> > > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > --
> > > > > > > >
> > > > > > > > Best Regards, Anton Churaev
> > > > > > > >
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > --
> > > > > > > Best Regards, Vyacheslav
> > > > > > >
> > > > > >
> > > > >
> > > > >
> > > > >
> > > > > --
> > > > > Best Regards, Vyacheslav
> > > > >
> > > >
> > >
> > >
> > >
> > > --
> > > Best Regards, Vyacheslav
> > >
> >
>
>
>
> --
>
> Best Regards, Anton Churaev
>



--
Sergey Kozlov
GridGain Systems
www.gridgain.com

Re: Data compression in Ignite 2.0

daradurvs
Hi Igniters.

Vladimir, I want to propose another design for implementing per-field
compression.

1) We add a new step, for example in the prepareForCache method of
CacheObject, or in GridCacheMapEntry.

At this step, after the object has been marshalled, we compress those fields
of the object that were described in advance.
The user describes the class fields he wants to compress in a separate
entity, like Metadata.

For compression, we introduce another entity, for example
CompressionProcessor, which works with the byte array (the marshalled object).
The entity reads the bytes of the described fields, compresses them, and
rewrites the binary representation of the whole object.
After processing, the object is put into the cache.

In this case the design is not tied to the binary infrastructure.
But there is a big heap-memory overhead for the buffer.

2) Another solution is to compress the byte array of the whole object when
copying it off-heap.
But in this case I don't yet understand how to support querying and
indexing.
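
As a rough illustration of option 1 (this is not the code from the PR), the
sketch below assumes the marshalled object is a plain byte array and that the
field to compress is known by its offset and length. The class
FieldCompressionSketch, its method names, and the length-prefixed layout are
hypothetical; they are not Ignite APIs and not Ignite's binary format. Only
java.util.zip and java.nio from the JDK are used.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.util.Arrays;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Hypothetical helper for illustration only; not part of Ignite.
final class FieldCompressionSketch {
    /** Compresses bytes [off, off + len) of a marshalled object and rebuilds the array. */
    static byte[] compressField(byte[] obj, int off, int len) {
        Deflater deflater = new Deflater();
        deflater.setInput(obj, off, len);
        deflater.finish();

        ByteArrayOutputStream packed = new ByteArrayOutputStream();
        byte[] buf = new byte[256];
        while (!deflater.finished()) {
            int n = deflater.deflate(buf);
            packed.write(buf, 0, n);
        }
        deflater.end();

        byte[] field = packed.toByteArray();
        // Assumed layout: prefix | int compressedLength | compressed bytes | suffix
        return ByteBuffer.allocate(obj.length - len + 4 + field.length)
            .put(obj, 0, off)
            .putInt(field.length)
            .put(field)
            .put(obj, off + len, obj.length - off - len)
            .array();
    }

    /** Reverses compressField; origLen is the uncompressed field length. */
    static byte[] decompressField(byte[] obj, int off, int origLen) {
        int packedLen = ByteBuffer.wrap(obj).getInt(off);
        byte[] field = new byte[origLen];

        Inflater inflater = new Inflater();
        inflater.setInput(obj, off + 4, packedLen);
        try {
            int read = 0;
            while (read < origLen && !inflater.finished()) {
                int n = inflater.inflate(field, read, origLen - read);
                if (n == 0 && (inflater.needsInput() || inflater.needsDictionary()))
                    throw new IllegalStateException("truncated compressed field");
                read += n;
            }
        } catch (DataFormatException e) {
            throw new IllegalStateException("corrupt compressed field", e);
        } finally {
            inflater.end();
        }

        int tail = off + 4 + packedLen; // first byte after the compressed field
        return ByteBuffer.allocate(off + origLen + obj.length - tail)
            .put(obj, 0, off)
            .put(field)
            .put(obj, tail, obj.length - tail)
            .array();
    }

    public static void main(String[] args) {
        byte[] obj = new byte[1024];
        Arrays.fill(obj, 16, 1016, (byte) 'x'); // a highly compressible "field"
        byte[] packed = compressField(obj, 16, 1000);
        byte[] restored = decompressField(packed, 16, 1000);
        System.out.println(obj.length + " -> " + packed.length
            + " bytes, round trip ok: " + Arrays.equals(obj, restored));
    }
}
```

The extra array allocated in compressField corresponds to the heap overhead
mentioned in option 1. Option 2 would run the same Deflater over the whole
array instead, so reading or indexing a single field would presumably require
inflating the entire object first.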


2017-06-09 11:21 GMT+03:00 Sergey Kozlov <[hidden email]>:

> Hi
>
> * "Per-field compression" is applicable to huge BLOB fields and will
> impose restrictions such as the inability to index such fields, slower data
> retrieval, and potential OOM issues if the compression ratio is too high.
> But for some cases it makes sense.
>
> On Fri, Jun 9, 2017 at 11:11 AM, Антон Чураев <[hidden email]>
> wrote:
>
> > Seems that Dmitry is referring to transparent data encryption. It is used
> > throughout the whole database industry.
> >
> > 2017-06-09 10:50 GMT+03:00 Vladimir Ozerov <[hidden email]>:
> >
> > > Dima,
> > >
> > > Encryption of certain fields is as bad as compression. First, it is a
> > > huge change, which makes the already complex binary protocol even more
> > > complex. Second, it has to be ported to the CPP and .NET platforms, as
> > > well as to JDBC and ODBC.
> > > Last, but most important: it is not our headache to encrypt sensitive
> > > data. This is the user's responsibility. Nobody in their right mind will
> > > store passwords in plain form. Instead, the user should encrypt them on
> > > his own, choosing proper encryption parameters: algorithms, key lengths,
> > > salts, etc. How are you going to expose this in the API or configuration?
> > >
> > > We should not implement data encryption at the binary level; this is out
> > > of the question. Encryption should be implemented at the application level
> > > (user effort), the transport layer (SSL, which we already have), and
> > > possibly at the disk level (there are tools for this already).
> > >
> > >
> > > On Fri, Jun 9, 2017 at 9:06 AM, Vyacheslav Daradur <[hidden email]>
> > > wrote:
> > >
> > > > >> which is much less useful.
> > > > I note that in some cases the gain is more than twofold in object
> > > > size.
> > > >
> > > > >> Would it be possible to change your implementation to handle the
> > > > >> encryption instead?
> > > > Yes, of course, there's not much difference between compression and
> > > > encryption, including in my implementation of per-field compression.
> > > >
> > > > 2017-06-09 8:55 GMT+03:00 Dmitriy Setrakyan <[hidden email]>:
> > > >
> > > > > Vyacheslav,
> > > > >
> > > > > When this feature started out as data compression in Ignite, it
> > > > > sounded very useful. Now it is unfolding as per-field compression,
> > > > > which is much less useful. In fact, it is questionable whether it is
> > > > > useful at all. The fact that this feature is implemented does not
> > > > > make it mandatory for the community to accept it.
> > > > >
> > > > > However, as I mentioned before, per-field encryption is very useful,
> > > > > as it would allow users to automatically encrypt certain sensitive
> > > > > fields, like passwords, credit card numbers, etc. There is not much
> > > > > conceptual difference between compressing a field vs encrypting a
> > > > > field. Would it be possible to change your implementation to handle
> > > > > the encryption instead?
> > > > >
> > > > > D.
> > > > >
> > > > > On Thu, Jun 8, 2017 at 10:42 PM, Vyacheslav Daradur <[hidden email]>
> > > > > wrote:
> > > > >
> > > > > > Guys, I want to be clear:
> > > > > > * "Per-field compression" design is the result of a research of
> the
> > > > > binary
> > > > > > infrastructure of Ignite and some other its places (querying,
> > > indexing,
> > > > > > etc.)
> > > > > > * Full-compression of object will be more effective, but in this
> > case
> > > > > there
> > > > > > is no capability with querying and indexing (or there is large
> > > overhead
> > > > > by
> > > > > > way of decompressing of full object (or caches pages) on demand)
> > > > > > * "Per-field compression" is a one of ways to implement the
> > > compression
> > > > > > feature
> > > > > >
> > > > > > > I'm new to Ignite, so I may be mistaken about some things.
> > > > > > > For the last 3-4 months I've tried to start a discussion about the
> > > > > > > design, but nobody has answered (except Dmitry and Valentin, who were
> > > > > > > interested in how it works).
> > > > > > > But I understand that this is a community and nobody is obliged to
> > > > > > > anybody.
> > > > > >
> > > > > > > There are strong Ignite experts here.
> > > > > > > If they can help me and the community with the design of the
> > > > > > > compression feature, it will be great.
> > > > > > > At the moment I have the desire and the time to work on the
> > > > > > > compression feature in Ignite.
> > > > > > > Let's use this opportunity :)
> > > > > >
> > > > > > 2017-06-09 5:36 GMT+03:00 Dmitriy Setrakyan <
> [hidden email]
> > >:
> > > > > >
> > > > > > > Igniters,
> > > > > > >
> > > > > > > > I have never seen a single Ignite user asking about compressing a
> > > > > > > > single field. However, we have had requests to secure certain fields,
> > > > > > > > e.g. passwords.
> > > > > > > >
> > > > > > > > I personally do not think per-field compression is needed, unless we
> > > > > > > > can point out some concrete real life use cases.
> > > > > > >
> > > > > > > D.
> > > > > > >
> > > > > > > On Thu, Jun 8, 2017 at 3:42 AM, Vyacheslav Daradur <
> > > > > [hidden email]>
> > > > > > > wrote:
> > > > > > >
> > > > > > > > Anton,
> > > > > > > >
> > > > > > > > > >> I thought that if compressed data is stored in memory, the data
> > > > > > > > > >> will be transmitted over the wire compressed too. Is that right?
> > > > > > > >
> > > > > > > > In per-field compression case - yes.
> > > > > > > >
> > > > > > > > 2017-06-08 13:36 GMT+03:00 Антон Чураев <
> [hidden email]
> > >:
> > > > > > > >
> > > > > > > > > > Guys, could you please help me?
> > > > > > > > > > I thought that if compressed data is stored in memory, the data
> > > > > > > > > > will be transmitted over the wire compressed too. Is that right?
> > > > > > > > >
> > > > > > > > > 2017-06-08 13:30 GMT+03:00 Vyacheslav Daradur <
> > > > [hidden email]
> > > > > >:
> > > > > > > > >
> > > > > > > > > > Vladimir,
> > > > > > > > > >
> > > > > > > > > > > The main problem I'm trying to solve is storing data in memory in
> > > > > > > > > > > compressed form via Ignite.
> > > > > > > > > > > The main goal is to use memory more effectively.
> > > > > > > > > >
> > > > > > > > > > > >> here the much simpler step would be to do full compression on a
> > > > > > > > > > > per-cache basis rather than dealing with the per-field case.
> > > > > > > > > > >
> > > > > > > > > > > Please explain your idea. Compress data by memory page?
> > > > > > > > > > > Is it compatible with querying and indexing?
> > > > > > > > > >
> > > > > > > > > > > >> In the end, if user would like to compress particular field, he can
> > > > > > > > > > > always do it on his own
> > > > > > > > > > > I don't think we should reason this way: if a user needs something, he
> > > > > > > > > > > tries to choose a tool which has this feature OOTB.
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > 2017-06-08 12:53 GMT+03:00 Vladimir Ozerov <
> > > > [hidden email]
> > > > > >:
> > > > > > > > > >
> > > > > > > > > > > Igniters,
> > > > > > > > > > >
> > > > > > > > > > > > Honestly, I still do not see how to apply this feature gracefully to
> > > > > > > > > > > > Ignite. And the overall approach of compressing only particular fields
> > > > > > > > > > > > looks overcomplicated to me. Remember that our main use case is an
> > > > > > > > > > > > application without classes on the server. It means that any kind of
> > > > > > > > > > > > annotation is inapplicable. To be more precise: a proper API should be
> > > > > > > > > > > > implemented to handle the no-class case (e.g. how would one build such
> > > > > > > > > > > > an object through BinaryBuilder without a class?), and only then should
> > > > > > > > > > > > annotations be added as a convenient addition to the more basic API.
> > > > > > > > > > >
> > > > > > > > > > > > It seems to me that a full implementation, which takes into account a
> > > > > > > > > > > > proper "classless" API, changes to binary metadata to reflect
> > > > > > > > > > > > compressed fields, changes to SQL, changes to the binary protocol, and
> > > > > > > > > > > > porting to .NET and CPP, will yield a very complex solution with little
> > > > > > > > > > > > value to the product.
> > > > > > > > > > >
> > > > > > > > > > > > Instead, as I proposed earlier, it seems that we'd better start with
> > > > > > > > > > > > the problem we are trying to solve. Basically, compression could help
> > > > > > > > > > > > in two cases:
> > > > > > > > > > > > 1) Transmitting data over the wire - it should be implemented on the
> > > > > > > > > > > > communication layer and should not affect the binary serialization
> > > > > > > > > > > > component a lot.
> > > > > > > > > > > > 2) Storing data in memory - here the much simpler step would be to do
> > > > > > > > > > > > full compression on a per-cache basis rather than dealing with the
> > > > > > > > > > > > per-field case.
> > > > > > > > > > >
> > > > > > > > > > > > In the end, if a user would like to compress a particular field, he
> > > > > > > > > > > > can always do it on his own, and set the already compressed field on
> > > > > > > > > > > > our BinaryObject.
> > > > > > > > > > >
> > > > > > > > > > > Vladimir.
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > On Thu, Jun 8, 2017 at 12:37 PM, Vyacheslav Daradur <
> > > > > > > > > [hidden email]
> > > > > > > > > > >
> > > > > > > > > > > wrote:
> > > > > > > > > > >
> > > > > > > > > > > > Valentin,
> > > > > > > > > > > >
> > > > > > > > > > > > Yes, I have the prototype[1][2]
> > > > > > > > > > > >
> > > > > > > > > > > > > You can see an example of a Java class[3] that I used in my benchmark.
> > > > > > > > > > > > > For example:
> > > > > > > > > > > > class Foo {
> > > > > > > > > > > > @BinaryCompression
> > > > > > > > > > > > String data;
> > > > > > > > > > > > }
> > > > > > > > > > > > > If a user decides to store the object in compressed form, he can use
> > > > > > > > > > > > > the @BinaryCompression annotation as shown above.
> > > > > > > > > > > > > It means the annotated field 'data' will be compressed at marshalling.
> > > > > > > > > > > >
> > > > > > > > > > > > > [1] https://github.com/apache/ignite/pull/1951
> > > > > > > > > > > > > [2] https://issues.apache.org/jira/browse/IGNITE-5226
> > > > > > > > > > > > > [3]
> > > > > > > > > > > > > https://github.com/daradurvs/ignite-compression/blob/master/src/main/java/ru/daradurvs/ignite/compression/model/Audit1F.java
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > 2017-06-08 2:04 GMT+03:00 Valentin Kulichenko <
> > > > > > > > > > > > [hidden email]
> > > > > > > > > > > > >:
> > > > > > > > > > > >
> > > > > > > > > > > > > Vyacheslav, Anton,
> > > > > > > > > > > > >
> > > > > > > > > > > > > > Are there any ideas and/or prototypes for the API? Your design
> > > > > > > > > > > > > > suggestions seem to make sense, but I would like to see how all this
> > > > > > > > > > > > > > will look from the user's standpoint.
> > > > > > > > > > > > >
> > > > > > > > > > > > > -Val
> > > > > > > > > > > > >
> > > > > > > > > > > > > On Wed, Jun 7, 2017 at 1:06 AM, Антон Чураев <
> > > > > > > > [hidden email]
> > > > > > > > > >
> > > > > > > > > > > > wrote:
> > > > > > > > > > > > >
> > > > > > > > > > > > > > > Vyacheslav, correct me if something is wrong.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > We could give users the opportunity to choose between CPU usage and
> > > > > > > > > > > > > > > MEM/NET usage by compressing some attributes of stored objects.
> > > > > > > > > > > > > > > You have studied the design, and it is possible to localize the changes
> > > > > > > > > > > > > > > in marshalling without affecting performance or current functionality.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > I think that it's useful for our project and users.
> > > > > > > > > > > > > > > Community, what do you think about this proposal?
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > 2017-06-06 17:29 GMT+03:00 Vyacheslav Daradur <
> > > > > > > > > [hidden email]
> > > > > > > > > > >:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > In short,
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > During marshalling, a field is represented by a BinaryFieldAccessor,
> > > > > > > > > > > > > > > > which manages its marshalling. It checks whether the field is marked
> > > > > > > > > > > > > > > > with the @BinaryCompression annotation; if so, the binary
> > > > > > > > > > > > > > > > representation of the field (a byte array) is compressed. It is marked
> > > > > > > > > > > > > > > > as compressed by a type constant (GridBinaryMarshaller.COMPRESSED),
> > > > > > > > > > > > > > > > and after this the compressed byte array is included in the binary
> > > > > > > > > > > > > > > > representation of the whole object. Note that the header of the
> > > > > > > > > > > > > > > > marshalled object is not compressed. Compression affects only the
> > > > > > > > > > > > > > > > representation of the object's fields.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Objects in IgniteCache are represented as BinaryObject, which is a
> > > > > > > > > > > > > > > > wrapper over the byte array of the marshalled object.
> > > > > > > > > > > > > > > > BinaryObject provides some useful methods, which are used by Ignite
> > > > > > > > > > > > > > > > systems.
> > > > > > > > > > > > > > > > For example, queries use the BinaryObject#field method, which
> > > > > > > > > > > > > > > > deserializes only one field of the object, without deserializing the
> > > > > > > > > > > > > > > > whole object.
> > > > > > > > > > > > > > > > If the BinaryObject#field method meets the constant of the compressed
> > > > > > > > > > > > > > > > type during deserialization, it decompresses this byte array and then
> > > > > > > > > > > > > > > > continues unmarshalling as usual.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Now, I have introduced the Compressor interface in
> > > > > > > > > > > > > > > > IgniteConfigurations; it allows the user to plug in his own compressor
> > > > > > > > > > > > > > > > implementation - this is the requirement in the task[1].
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > As far as I know, Vladimir Ozerov doesn't like the idea of granting
> > > > > > > > > > > > > > > > this opportunity to the user.
> > > > > > > > > > > > > > > > In that case we can choose a compression algorithm which we will
> > > > > > > > > > > > > > > > provide by default and move the interface to the internals of the
> > > > > > > > > > > > > > > > binary infrastructure.
> > > > > > > > > > > > > > > > For this case I've prepared the benchmarks which I sent earlier.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > I vote for the ZSTD algorithm[2]; it provides a good compression ratio
> > > > > > > > > > > > > > > > and good throughput. It has implementations in Java, .NET and C++, and
> > > > > > > > > > > > > > > > has an ASF-friendly license, so we can use it on all the Ignite
> > > > > > > > > > > > > > > > platforms.
> > > > > > > > > > > > > > > > You can look at an assessment of this algorithm in my benchmarks.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > [1] https://issues.apache.org/jira/browse/IGNITE-3592
> > > > > > > > > > > > > > > > [2] https://github.com/facebook/zstd
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > 2017-06-06 16:02 GMT+03:00 Антон Чураев <
> > > > > > > > [hidden email]
> > > > > > > > > >:
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Looks good to me.
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Could you describe the implementation design in a couple of
> > > > > > > > > > > > > > > > > sentences, so that we can estimate the completeness and complexity of
> > > > > > > > > > > > > > > > > the proposal?
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > 2017-06-06 15:26 GMT+03:00 Vyacheslav
> Daradur <
> > > > > > > > > > > [hidden email]
> > > > > > > > > > > > >:
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Anton,
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > Of course, the solution does not affect the existing implementation.
> > > > > > > > > > > > > > > > > > I mean, nothing changes if the user does not use the
> > > > > > > > > > > > > > > > > > @BinaryCompression annotation (no performance changes).
> > > > > > > > > > > > > > > > > > Only if the user decides to use compression on a specific field or
> > > > > > > > > > > > > > > > > > fields of a class will compression be applied at marshalling to the
> > > > > > > > > > > > > > > > > > annotated fields.
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > 2017-06-06 15:10 GMT+03:00 Антон Чураев <
> > > > > > > > > > [hidden email]
> > > > > > > > > > > >:
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > Vyacheslav,
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > Is it possible to propose an implementation that can be switched on
> > > > > > > > > > > > > > > > > > > on demand?
> > > > > > > > > > > > > > > > > > > In this case it should not affect the performance of the current
> > > > > > > > > > > > > > > > > > > solution.
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > I mean that users should decide what is more important for them:
> > > > > > > > > > > > > > > > > > > throughput or memory/net usage.
> > > > > > > > > > > > > > > > > > > Maybe they will choose to compress not all objects, or only some
> > > > > > > > > > > > > > > > > > > attributes of objects.
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > 2017-06-06 14:48 GMT+03:00 Vyacheslav
> > > Daradur <
> > > > > > > > > > > > > [hidden email]
> > > > > > > > > > > > > > >:
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > Conclusion:
> > > > > > > > > > > > > > > > > > > > The provided solution allows reducing the size of an object in
> > > > > > > > > > > > > > > > > > > > IgniteCache at the cost of a throughput reduction (small in some
> > > > > > > > > > > > > > > > > > > > cases); it depends on the part of the object which will be compressed
> > > > > > > > > > > > > > > > > > > > and on the compression algorithm.
> > > > > > > > > > > > > > > > > > > > I mean, we can make more effective use of memory, and in some cases it
> > > > > > > > > > > > > > > > > > > > can reduce the load on the interconnect (replication, rebalancing).
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > It will be particularly useful for object fields which are large text
> > > > > > > > > > > > > > > > > > > > (>~ 250 bytes) and can be effectively compressed.
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > 2017-06-06 12:00 GMT+03:00 Антон
> Чураев <
> > > > > > > > > > > > [hidden email]
> > > > > > > > > > > > > >:
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > Vyacheslav, thank you! But could you please provide conclusions or
> > > > > > > > > > > > > > > > > > > > > proposals based on these benchmarks?
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > 2017-06-06 11:28 GMT+03:00 Vyacheslav
> > > > > Daradur <
> > > > > > > > > > > > > > > [hidden email]
> > > > > > > > > > > > > > > > >:
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > Dmitry,
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > Excel-pages:
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > 1) "Compression ratio (2)" - shows object size, with and without
> > > > > > > > > > > > > > > > > > > > > > compression. (Conditions: literal text)
> > > > > > > > > > > > > > > > > > > > > > 1st graph shows the compression ratios of different compression
> > > > > > > > > > > > > > > > > > > > > > algorithms depending on the size of the compressed field.
> > > > > > > > > > > > > > > > > > > > > > 2nd graph shows an evaluation of the size of objects depending on
> > > > > > > > > > > > > > > > > > > > > > sizes and compression algorithms.
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > 2) "Compression ratio (1)" - shows object size, with and without
> > > > > > > > > > > > > > > > > > > > > > compression. (Conditions: badly compressible character sequence)
> > > > > > > > > > > > > > > > > > > > > > 1st graph shows the compression ratios of different compression
> > > > > > > > > > > > > > > > > > > > > > algorithms depending on the size of the compressed field.
> > > > > > > > > > > > > > > > > > > > > > 2nd graph shows an evaluation of the size of objects depending on
> > > > > > > > > > > > > > > > > > > > > > sizes and compression algorithms.
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > 3) "put-avg" - shows the average time of the "put" operation
> > > > > > > > > > > > > > > > > > > > > > depending on size and compression algorithms.
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > 4) "put-thrpt" - shows the throughput of the "put" operation
> > > > > > > > > > > > > > > > > > > > > > depending on size and compression algorithms.
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > 5) "get-avg" - shows the average time of the "get" operation
> > > > > > > > > > > > > > > > > > > > > > depending on size and compression algorithms.
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > 6) "get-thrpt" - shows the throughput of the "get" operation
> > > > > > > > > > > > > > > > > > > > > > depending on size and compression algorithms.
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > 2017-06-06 10:59 GMT+03:00 Dmitriy
> > > > > Setrakyan
> > > > > > <
> > > > > > > > > > > > > > > > > [hidden email]
> > > > > > > > > > > > > > > > > > >:
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > Vladimir, I am not sure how to interpret the graphs? What are we
> > > > > > > > > > > > > > > > > > > > > > > looking at?
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > On Tue, Jun 6, 2017 at 12:33 AM,
> > > > > Vyacheslav
> > > > > > > > > > Daradur <
> > > > > > > > > > > > > > > > > > > > [hidden email]
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > Hi, Igniters.
> > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > I've prepared some benchmarks. Results [1].
> > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > And I've prepared the evaluation in the form of diagrams [2].
> > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > I hope this helps to interest the community and speeds up a reaction
> > > > > > > > > > > > > > > > > > > > > > > > to this improvement :)
> > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > [1]
> > > > > > > > > > > > > > > > > > > > > > > > https://github.com/daradurvs/ignite-compression/tree/master/src/main/resources/result
> > > > > > > > > > > > > > > > > > > > > > > > [2] https://drive.google.com/file/d/0B2CeUAOgrHkoMklyZ25YTEdKcEk/view
> > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > 2017-05-24 9:49 GMT+03:00
> > > Vyacheslav
> > > > > > > Daradur
> > > > > > > > <
> > > > > > > > > > > > > > > > > > [hidden email]
> > > > > > > > > > > > > > > > > > > >:
> > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > Guys, any thoughts?
> > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > 2017-05-16 13:40 GMT+03:00
> > > > Vyacheslav
> > > > > > > > > Daradur <
> > > > > > > > > > > > > > > > > > > [hidden email]
> > > > > > > > > > > > > > > > > > > > >:
> > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > >> Hi guys,
> > > > > > > > > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > > > > > > > > > >> I've prepared the PR to show my idea.
> > > > > > > > > > > > > > > > > > > > > > > > >> https://github.com/apache/ignite/pull/1951/files
> > > > > > > > > > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > > > > > > > > > >> About querying - I've just copied existing tests and annotated the
> > > > > > > > > > > > > > > > > > > > > > > > >> testing data.
> > > > > > > > > > > > > > > > > > > > > > > > >> https://github.com/apache/ignite/pull/1951/files#diff-c19a9df4058141d059bb577e75244764
> > > > > > > > > > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > > > > > > > > > >> It means fields marked with @BinaryCompression will be compressed at
> > > > > > > > > > > > > > > > > > > > > > > > >> marshalling via BinaryMarshaller.
> > > > > > > > > > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > > > > > > > > > >> This solution has no effect on existing data or project architecture.
> > > > > > > > > > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > > > > > > > > > >> I'll be glad to see your thoughts.
> > > > > > > > > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > > > > > > > > >> 2017-05-15 19:18 GMT+03:00
> > > > > Vyacheslav
> > > > > > > > > Daradur
> > > > > > > > > > <
> > > > > > > > > > > > > > > > > > > > [hidden email]
> > > > > > > > > > > > > > > > > > > > > >:
> > > > > > > > > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > > > > > > > > >>> Dmitriy,
> > > > > > > > > > > > > > > > > > > > > > > >>>
> > > > > > > > > > > > > > > > > > > > > > > > >>> I have a ready prototype. I want to show it.
> > > > > > > > > > > > > > > > > > > > > > > > >>> It is always easier to discuss with an example.
> > > > > > > > > > > > > > > > > > > > > > > >>>
> > > > > > > > > > > > > > > > > > > > > > > >>> 2017-05-15 19:02 GMT+03:00
> > > > Dmitriy
> > > > > > > > > Setrakyan
> > > > > > > > > > <
> > > > > > > > > > > > > > > > > > > > > [hidden email]
> > > > > > > > > > > > > > > > > > > > > > >:
> > > > > > > > > > > > > > > > > > > > > > > >>>
> > > > > > > > > > > > > > > > > > > > > > > >>>> Vyacheslav,
> > > > > > > > > > > > > > > > > > > > > > > >>>>
> > > > > > > > > > > > > > > > > > > > > > > >>>> I think it is a bit
> > premature
> > > to
> > > > > > > > provide a
> > > > > > > > > > PR
> > > > > > > > > > > > > > without
> > > > > > > > > > > > > > > > > > getting
> > > > > > > > > > > > > > > > > > > a
> > > > > > > > > > > > > > > > > > > > > > > >>>> community
> > > > > > > > > > > > > > > > > > > > > > > >>>> consensus on the dev list.
> > > > Please
> > > > > > > allow
> > > > > > > > > some
> > > > > > > > > > > > time
> > > > > > > > > > > > > > for
> > > > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > > > > community
> > > > > > > > > > > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > > > > > > > > >>>> respond.
> > > > > > > > > > > > > > > > > > > > > > > >>>>
> > > > > > > > > > > > > > > > > > > > > > > >>>> D.
> > > > > > > > > > > > > > > > > > > > > > > >>>>
> > > > > > > > > > > > > > > > > > > > > > > >>>> On Mon, May 15, 2017 at
> 6:36
> > > AM,
> > > > > > > > > Vyacheslav
> > > > > > > > > > > > > Daradur
> > > > > > > > > > > > > > <
> > > > > > > > > > > > > > > > > > > > > > > >>>> [hidden email]>
> > > > > > > > > > > > > > > > > > > > > > > >>>> wrote:
> > > > > > > > > > > > > > > > > > > > > > > >>>>
> > > > > > > > > > > > > > > > > > > > > > > > >>>> > I created the ticket:
> > > > > > > > > > > > > > > > > > > > > > > > >>>> > https://issues.apache.org/jira/browse/IGNITE-5226
> > > > > > > > > > > > > > > > > > > > > > > > >>>> >
> > > > > > > > > > > > > > > > > > > > > > > > >>>> > I'll prepare a PR with the described solution in a couple of days.
> > > > > > > > > > > > > > > > > > > > > > > >>>> >
> > > > > > > > > > > > > > > > > > > > > > > >>>> > 2017-05-15 15:05
> GMT+03:00
> > > > > > > Vyacheslav
> > > > > > > > > > > Daradur
> > > > > > > > > > > > <
> > > > > > > > > > > > > > > > > > > > > > [hidden email]
> > > > > > > > > > > > > > > > > > > > > > > >:
> > > > > > > > > > > > > > > > > > > > > > > >>>> >
> > > > > > > > > > > > > > > > > > > > > > > >>>> > > Hi, Igniters!
> > > > > > > > > > > > > > > > > > > > > > > >>>> > >
> > > > > > > > > > > > > > > > > > > > > > > > >>>> > > Apache Ignite 2.0 is released.
> > > > > > > > > > > > > > > > > > > > > > > >>>> > >
> > > > > > > > > > > > > > > > > > > > > > > > >>>> > > Let's continue the discussion about a compression design.
> > > > > > > > > > > > > > > > > > > > > > > > >>>> > >
> > > > > > > > > > > > > > > > > > > > > > > > >>>> > > At the moment, I have found only one solution which is compatible with
> > > > > > > > > > > > > > > > > > > > > > > > >>>> > > querying and indexing: per-object-field compression.
> > > > > > > > > > > > > > > > > > > > > > > > >>>> > > Per-field compression means that the metadata (a header) of an object
> > > > > > > > > > > > > > > > > > > > > > > > >>>> > > won't be compressed; only the serialized values of an object's fields
> > > > > > > > > > > > > > > > > > > > > > > > >>>> > > (in byte array form) will be compressed.
> > > > > > > > > > > > > > > > > > > > > > > >>>> > >
> > > > > > > > > > > > > > > > > > > > > > > >>>> > > This solution have
> some
> > > > > > > contentious
> > > > > > > > > > > issues:
> > > > > > > > > > > > > > > > > > > > > > > >>>> > > - small values, like
> > > > > primitives
> > > > > > > and
> > > > > > > > > > short
> > > > > > > > > > > > > > arrays -
> > > > > > > > > > > > > > > > > there
> > > > > > > > > > > > > > > > > > > > isn't
> > > > > > > > > > > > > > > > > > > > > > > >>>> sense to
> > > > > > > > > > > > > > > > > > > > > > > >>>> > > compress them;
> > > > > > > > > > > > > > > > > > > > > > > >>>> > > - there is no possible
> > to
> > > > use
> > > > > > > > > > compression
> > > > > > > > > > > > with
> > > > > > > > > > > > > > > > > > > > java-predefined
> > > > > > > > > > > > > > > > > > > > > > > >>>> types;
> > > > > > > > > > > > > > > > > > > > > > > >>>> > >
> > > > > > > > > > > > > > > > > > > > > > > >>>> > > We can provide an
> > > > annotation,
> > > > > > > > > > > > > > @IgniteCompression -
> > > > > > > > > > > > > > > > for
> > > > > > > > > > > > > > > > > > > > > example,
> > > > > > > > > > > > > > > > > > > > > > > >>>> which can
> > > > > > > > > > > > > > > > > > > > > > > >>>> > > be used by users for
> > > marking
> > > > > > > fields
> > > > > > > > to
> > > > > > > > > > > > > compress.
> > > > > > > > > > > > > > > > > > > > > > > >>>> > >
> > > > > > > > > > > > > > > > > > > > > > > >>>> > > Any thoughts?
> > > > > > > > > > > > > > > > > > > > > > > >>>> > >
> > > > > > > > > > > > > > > > > > > > > > > >>>> > > Maybe someone already
> > have
> > > > > ready
> > > > > > > > > design?
> > > > > > > > > > > > > > > > > > > > > > > >>>> > >
> > > > > > > > > > > > > > > > > > > > > > > >>>> > > 2017-04-10 11:06
> > GMT+03:00
> > > > > > > > Vyacheslav
> > > > > > > > > > > > Daradur
> > > > > > > > > > > > > <
> > > > > > > > > > > > > > > > > > > > > > > [hidden email]
> > > > > > > > > > > > > > > > > > > > > > > >>>> >:
> > > > > > > > > > > > > > > > > > > > > > > >>>> > >
> > > > > > > > > > > > > > > > > > > > > > > > >>>> > >> Alexey,
> > > > > > > > > > > > > > > > > > > > > > > > >>>> > >>
> > > > > > > > > > > > > > > > > > > > > > > > >>>> > >> Yes, I've read it.
> > > > > > > > > > > > > > > > > > > > > > > > >>>> > >>
> > > > > > > > > > > > > > > > > > > > > > > > >>>> > >> Ok, let's discuss the public API design.
> > > > > > > > > > > > > > > > > > > > > > > > >>>> > >>
> > > > > > > > > > > > > > > > > > > > > > > > >>>> > >> I think we need to add a configuration entity to CacheConfiguration,
> > > > > > > > > > > > > > > > > > > > > > > > >>>> > >> which will contain the Compressor interface implementation and some useful
> > > > > > > > > > > > > > > > > > > > > > > > >>>> > >> parameters.
> > > > > > > > > > > > > > > > > > > > > > > > >>>> > >> Or maybe provide a BinaryMarshaller decorator, which will compress
> > > > > > > > > > > > > > > > > > > > > > > > >>>> > >> data after marshalling.
> > > > > > > > > > > > > > > > > > > > > > > > >>>> > >>
> > > > > > > > > > > > > > > > > > > > > > > > >>>> > >>
> > > > > > > > > > > > > > > > > > > > > > > > >>>> > >> 2017-04-10 10:40 GMT+03:00 Alexey Kuznetsov <[hidden email]>:
> > > > > > > > > > > > > > > > > > > > > > > > >>>> > >>
> > > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> Vyacheslav,
> > > > > > > > > > > > > > > > > > > > > > > > >>>> > >>>
> > > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> Did you read the initial discussion [1] about compression?
> > > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> As far as I remember, we agreed to add only some "top-level" API in order
> > > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> to provide a way for Ignite users to inject some sort of custom compression.
> > > > > > > > > > > > > > > > > > > > > > > > >>>> > >>>
> > > > > > > > > > > > > > > > > > > > > > > > >>>> > >>>
> > > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> [1]
> > > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> http://apache-ignite-developers.2346864.n4.nabble.com/Data-compression-in-Ignite-2-0-td10099.html
> > > > > > > > > > > > > > > > > > > > > > > > >>>> > >>>
> > > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> On Mon, Apr 10, 2017 at 2:19 PM, daradurvs <[hidden email]> wrote:
> > > > > > > > > > > > > > > > > > > > > > > > >>>> > >>>
> > > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > Hi Igniters!
> > > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> >
> > > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > I am interested in this task.
> > > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > Provide some kind of pluggable compression SPI support
> > > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > <https://issues.apache.org/jira/browse/IGNITE-3592>
> > > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> >
> > > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > I developed a solution at the BinaryMarshaller level, but the reviewer has
> > > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > rejected it.
> > > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> >
> > > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > Let's continue the discussion of the task goals and solution design.
> > > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > As I understand it, the main goal of this task is to store data in
> > > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > compressed form.
> > > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > This is what I need from Ignite as its user. Compression saves money
> > > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > on servers.
> > > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > We can store more data on the same servers at the cost of increased CPU
> > > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > utilization.
> > > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> >
> > > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > I'm researching the possibility of implementing compression at the
> > > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > cache level.
> > > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> >
> > > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > Any thoughts?
> > > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> >
> > > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > --
> > > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > Best regards,
> > > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > Vyacheslav
> > > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> >
> > > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> >
> > > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> >
> > > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> >
> > > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > --
> > > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > View this message in context: http://apache-ignite-developers.2346864.n4.nabble.com/Data-compression-in-Ignite-2-0-tp10099p16317.html
> > > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > Sent from the Apache Ignite Developers mailing list archive at Nabble.com.
> > > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> >
> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>>
> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>>
> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>>
> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> --
> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> Alexey Kuznetsov
> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>>



--
Best Regards, Vyacheslav

Re: Data compression in Ignite 2.0

daradurvs
Hi Igniters!

I'd like to continue developing and discussing compression in Ignite.

Vladimir, could you propose a design for the compression feature in Ignite that
suits you?

2017-06-15 16:13 GMT+03:00 Vyacheslav Daradur <[hidden email]>:

> Hi Igniters.
>
> Vladimir, I want to propose another design for implementing
> per-field compression.
>
> 1) We add a new step to the prepareForCache method (for example) of
> CacheObject, or to GridCacheMapEntry.
>
> At this step, after an object is marshalled, we compress the fields of
> the object that were described in advance.
> The user describes the class fields he wants to compress in a separate
> entity, such as Metadata.
>
> For compression, we introduce another entity, for example
> CompressionProcessor, which works with the byte array (the marshalled object).
> The entity reads the bytes of the described fields, compresses them and
> rewrites the binary representation of the whole object.
> After processing, the object is put into the cache.
>
> In this case the design does not depend on the binary infrastructure.
> But there is a big heap-memory overhead for the buffer.
>
> 2) Another solution is to compress the byte array of the whole object when
> copying it off-heap.
> But in this case I don't yet understand how to support
> querying and indexing.
>
>
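
A minimal sketch of option 1 above, under simplifying assumptions. FieldCompressionSketch, compressField and decompressField are hypothetical names and are not part of Ignite. The field offset and length are assumed to be known from the user-supplied metadata, and the 4-byte length prefix is an invented layout, not Ignite's binary format. It only illustrates "read the bytes of a described field, compress them, rewrite the whole array" using java.util.zip:

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

/** Hypothetical helper, not an Ignite class. */
final class FieldCompressionSketch {
    /** Returns prefix + 4-byte compressed length + compressed field + suffix. */
    static byte[] compressField(byte[] obj, int off, int len) {
        byte[] packed = deflate(obj, off, len);
        ByteBuffer out = ByteBuffer.allocate(obj.length - len + 4 + packed.length);
        out.put(obj, 0, off);                            // untouched header bytes
        out.putInt(packed.length);                       // invented length prefix
        out.put(packed);                                 // compressed field bytes
        out.put(obj, off + len, obj.length - off - len); // untouched trailing bytes
        return out.array();
    }

    /** Reverses compressField; off is the same field offset. */
    static byte[] decompressField(byte[] obj, int off) {
        int packedLen = ByteBuffer.wrap(obj, off, 4).getInt();
        byte[] field = inflate(obj, off + 4, packedLen);
        int tailOff = off + 4 + packedLen;
        ByteBuffer out = ByteBuffer.allocate(off + field.length + obj.length - tailOff);
        out.put(obj, 0, off);
        out.put(field);
        out.put(obj, tailOff, obj.length - tailOff);
        return out.array();
    }

    private static byte[] deflate(byte[] src, int off, int len) {
        Deflater d = new Deflater();
        d.setInput(src, off, len);
        d.finish();
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        byte[] buf = new byte[256];
        while (!d.finished())
            bos.write(buf, 0, d.deflate(buf));
        d.end();
        return bos.toByteArray();
    }

    private static byte[] inflate(byte[] src, int off, int len) {
        Inflater inf = new Inflater();
        inf.setInput(src, off, len);
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        byte[] buf = new byte[256];
        try {
            while (!inf.finished()) {
                int n = inf.inflate(buf);
                if (n == 0 && (inf.needsInput() || inf.needsDictionary()))
                    throw new IllegalStateException("Truncated or corrupt field data");
                bos.write(buf, 0, n);
            }
        } catch (DataFormatException e) {
            throw new IllegalStateException("Corrupt field data", e);
        } finally {
            inf.end();
        }
        return bos.toByteArray();
    }

    public static void main(String[] args) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 50; i++)
            sb.append("compress me ");
        byte[] field = sb.toString().getBytes(StandardCharsets.UTF_8);

        // Fake "marshalled object": 4 header bytes, one field, 2 trailing bytes.
        byte[] obj = new byte[4 + field.length + 2];
        System.arraycopy(field, 0, obj, 4, field.length);

        byte[] packed = compressField(obj, 4, field.length);
        byte[] restored = decompressField(packed, 4);
        System.out.println(obj.length + " bytes -> " + packed.length + " bytes, round trip ok: "
            + Arrays.equals(obj, restored));
    }
}
```

The extra byte arrays allocated in compressField are the heap-memory overhead for the buffer mentioned above. A field rewritten this way can no longer be read in place, which is why indexing or querying such a field would first require decompression.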
> 2017-06-09 11:21 GMT+03:00 Sergey Kozlov <[hidden email]>:
>
>> Hi
>>
>> * "Per-field compression" is applicable to huge BLOB fields and will
>> impose restrictions such as being unable to index such fields, slower data
>> retrieval, and potential OOM issues if the compression ratio is too high.
>> But in some cases it makes sense.
>>
>> On Fri, Jun 9, 2017 at 11:11 AM, Антон Чураев <[hidden email]>
>> wrote:
>>
>> > Seems that Dmitry is referring to transparent data encryption. It is used
>> > throughout the whole database industry.
>> >
>> > 2017-06-09 10:50 GMT+03:00 Vladimir Ozerov <[hidden email]>:
>> >
>> > > Dima,
>> > >
>> > > Encryption of certain fields is as bad as compression. First, it is a huge
>> > > change, which makes the already complex binary protocol even more complex.
>> > > Second, it has to be ported to the CPP and .NET platforms, as well as to JDBC
>> > > and ODBC.
>> > > Last, but most important: encrypting sensitive data is not our headache.
>> > > This is the user's responsibility. Nobody in their right mind will
>> > > store passwords in plain form. Instead, the user should encrypt them on his own,
>> > > choosing proper encryption parameters: algorithms, key lengths, salts,
>> > > etc. How are you going to expose this in API or configuration?
>> > >
>> > > We should not implement data encryption on the binary level; this is out of
>> > > the question. Encryption should be implemented on the application level (user
>> > > efforts), the transport layer (SSL - we already have it), and possibly on
>> > > the disk level (there are tools for this already).
>> > >
>> > >
>> > > On Fri, Jun 9, 2017 at 9:06 AM, Vyacheslav Daradur <
>> [hidden email]>
>> > > wrote:
>> > >
>> > > > >> which is much less useful.
>> > > > Note that in some cases the size of an object is reduced by more than
>> > > > half.
>> > > >
>> > > > >> Would it be possible to change your implementation to handle the
>> > > > encryption instead?
>> > > > Yes, of course, there's not much difference between compression and
>> > > > encryption, including in my implementation of per-field compression.
>> > > >
>> > > > 2017-06-09 8:55 GMT+03:00 Dmitriy Setrakyan <[hidden email]
>> >:
>> > > >
>> > > > > Vyacheslav,
>> > > > >
>> > > > > When this feature started out as data compression in Ignite, it sounded
>> > > > > very useful. Now it is unfolding as a per-field compression, which is much
>> > > > > less useful. In fact, it is questionable whether it is useful at all. The
>> > > > > fact that this feature is implemented does not make it mandatory for the
>> > > > > community to accept it.
>> > > > >
>> > > > > However, as I mentioned before, per-field encryption is very useful, as it
>> > > > > would allow users to automatically encrypt certain sensitive fields, like
>> > > > > passwords, credit card numbers, etc. There is not much conceptual
>> > > > > difference between compressing a field vs encrypting a field. Would it be
>> > > > > possible to change your implementation to handle the encryption instead?
>> > > > >
>> > > > > D.
>> > > > >
>> > > > > On Thu, Jun 8, 2017 at 10:42 PM, Vyacheslav Daradur <
>> > > [hidden email]
>> > > > >
>> > > > > wrote:
>> > > > >
>> > > > > > Guys, I want to be clear:
>> > > > > > * The "per-field compression" design is the result of researching the
>> > > > > > binary infrastructure of Ignite and some of its other parts (querying,
>> > > > > > indexing, etc.)
>> > > > > > * Full compression of an object will be more effective, but in this case
>> > > > > > there is no compatibility with querying and indexing (or there is a large
>> > > > > > overhead from decompressing the full object (or cache pages) on demand)
>> > > > > > * "Per-field compression" is one of the ways to implement the compression
>> > > > > > feature
>> > > > > >
>> > > > > > I'm new to Ignite, so I may be mistaken about some things.
>> > > > > > For the last 3-4 months I've tried to start a discussion about a design, but
>> > > > > > nobody has answered (except Dmitry and Valentin, who were interested in how
>> > > > > > it works).
>> > > > > > But I understand that this is a community and nobody is obliged to anybody.
>> > > > > >
>> > > > > > There are strong Ignite experts.
>> > > > > > If they can help me and the community with a design of the compression
>> > > > > > feature, it will be great.
>> > > > > > At the moment I have the desire and time to be engaged in the development of
>> > > > > > the compression feature in Ignite.
>> > > > > > Let's use this opportunity :)
>> > > > > >
>> > > > > > 2017-06-09 5:36 GMT+03:00 Dmitriy Setrakyan <
>> [hidden email]
>> > >:
>> > > > > >
>> > > > > > > Igniters,
>> > > > > > >
>> > > > > > > I have never seen a single Ignite user asking about compressing a single
>> > > > > > > field. However, we have had requests to secure certain fields, e.g.
>> > > > > > > passwords.
>> > > > > > >
>> > > > > > > I personally do not think per-field compression is needed, unless we can
>> > > > > > > point out some concrete real life use cases.
>> > > > > > >
>> > > > > > > D.
>> > > > > > >
>> > > > > > > On Thu, Jun 8, 2017 at 3:42 AM, Vyacheslav Daradur <
>> > > > > [hidden email]>
>> > > > > > > wrote:
>> > > > > > >
>> > > > > > > > Anton,
>> > > > > > > >
>> > > > > > > > >> I thought that if compressed data is stored in memory, the data
>> > > > > > > > >> will be transmitted over the wire compressed too. Is that right?
>> > > > > > > >
>> > > > > > > > In the per-field compression case - yes.
>> > > > > > > >
>> > > > > > > > 2017-06-08 13:36 GMT+03:00 Антон Чураев <
>> [hidden email]
>> > >:
>> > > > > > > >
>> > > > > > > > > Guys, could you please help me.
>> > > > > > > > > I thought that if compressed data is stored in memory, the data
>> > > > > > > > > will be transmitted over the wire compressed too. Is that right?
>> > > > > > > > >
>> > > > > > > > > 2017-06-08 13:30 GMT+03:00 Vyacheslav Daradur <
>> > > > [hidden email]
>> > > > > >:
>> > > > > > > > >
>> > > > > > > > > > Vladimir,
>> > > > > > > > > >
>> > > > > > > > > > The main problem I'm trying to solve is storing data in memory in
>> > > > > > > > > > compressed form via Ignite.
>> > > > > > > > > > The main goal is to use memory more effectively.
>> > > > > > > > > >
>> > > > > > > > > > >> here the much simpler step would be full
>> > > > > > > > > > compression on a per-cache basis rather than dealing with the per-field
>> > > > > > > > > > case.
>> > > > > > > > > >
>> > > > > > > > > > Please explain your idea. Compress data by memory page?
>> > > > > > > > > > Is it compatible with querying and indexing?
>> > > > > > > > > >
>> > > > > > > > > > >> In the end, if a user would like to compress a particular field, he
>> > > > > > > > > > can always do it on his own
>> > > > > > > > > > I don't think we should reason this way: if a user needs something, he
>> > > > > > > > > > tries to choose a tool which has this feature OOTB.
>> > > > > > > > > >
>> > > > > > > > > >
>> > > > > > > > > >
>> > > > > > > > > > 2017-06-08 12:53 GMT+03:00 Vladimir Ozerov <
>> > > > [hidden email]
>> > > > > >:
>> > > > > > > > > >
>> > > > > > > > > > > Igniters,
>> > > > > > > > > > >
>> > > > > > > > > > > Honestly I still do not see how to apply this feature gracefully to
>> > > > > > > > > > > Ignite. And the overall approach of compressing only particular fields
>> > > > > > > > > > > looks overcomplicated to me. Remember that our main use case is an
>> > > > > > > > > > > application without classes on the server. It means that any kind of
>> > > > > > > > > > > annotations are inapplicable. To be more precise: a proper API should be
>> > > > > > > > > > > implemented to handle the no-class case (e.g. how would one build such an
>> > > > > > > > > > > object through BinaryBuilder without a class?), and only then add
>> > > > > > > > > > > annotations as a convenient addition to the more basic API.
>> > > > > > > > > > >
>> > > > > > > > > > > It seems to me that a full implementation, which takes into account a
>> > > > > > > > > > > proper "classless" API, changes to binary metadata to reflect compressed
>> > > > > > > > > > > fields, changes to SQL, changes to binary protocol, and porting to .NET
>> > > > > > > > > > > and CPP, will yield a very complex solution with little value to the
>> > > > > > > > > > > product.
>> > > > > > > > > > >
>> > > > > > > > > > > Instead, as I proposed earlier, it seems that we'd better start with the
>> > > > > > > > > > > problem we are trying to solve. Basically, compression could help in two
>> > > > > > > > > > > cases:
>> > > > > > > > > > > 1) Transmitting data over wire - it should be implemented on the
>> > > > > > > > > > > communication layer and should not affect the binary serialization
>> > > > > > > > > > > component a lot.
>> > > > > > > > > > > 2) Storing data in memory - here the much simpler step would be full
>> > > > > > > > > > > compression on a per-cache basis rather than dealing with the per-field
>> > > > > > > > > > > case.
>> > > > > > > > > > >
>> > > > > > > > > > > In the end, if a user would like to compress a particular field, he can
>> > > > > > > > > > > always do it on his own, and set the already compressed field to our
>> > > > > > > > > > > BinaryObject.
>> > > > > > > > > > >
>> > > > > > > > > > > Vladimir.
>> > > > > > > > > > >
>> > > > > > > > > > >
>> > > > > > > > > > > On Thu, Jun 8, 2017 at 12:37 PM, Vyacheslav Daradur <
>> > > > > > > > > [hidden email]
>> > > > > > > > > > >
>> > > > > > > > > > > wrote:
>> > > > > > > > > > >
>> > > > > > > > > > > > Valentin,
>> > > > > > > > > > > >
>> > > > > > > > > > > > Yes, I have the prototype[1][2]
>> > > > > > > > > > > >
>> > > > > > > > > > > > You can see an example of a Java class[3] that I used in my benchmark.
>> > > > > > > > > > > > For example:
>> > > > > > > > > > > > class Foo {
>> > > > > > > > > > > >     @BinaryCompression
>> > > > > > > > > > > >     String data;
>> > > > > > > > > > > > }
>> > > > > > > > > > > > If the user decides to store the object in compressed form, he can use
>> > > > > > > > > > > > the annotation @BinaryCompression as shown above.
>> > > > > > > > > > > > It means the annotated field 'data' will be compressed at marshalling.
>> > > > > > > > > > > >
>> > > > > > > > > > > > [1] https://github.com/apache/ignite/pull/1951
>> > > > > > > > > > > > [2] https://issues.apache.org/jira/browse/IGNITE-5226
>> > > > > > > > > > > > [3]
>> > > > > > > > > > > > https://github.com/daradurvs/ignite-compression/blob/master/src/main/java/ru/daradurvs/ignite/compression/model/Audit1F.java
>> > > > > > > > > > > >
>> > > > > > > > > > > >
>> > > > > > > > > > > >
>> > > > > > > > > > > > 2017-06-08 2:04 GMT+03:00 Valentin Kulichenko <
>> > > > > > > > > > > > [hidden email]
>> > > > > > > > > > > > >:
>> > > > > > > > > > > >
>> > > > > > > > > > > > > Vyacheslav, Anton,
>> > > > > > > > > > > > >
>> > > > > > > > > > > > > Are there any ideas and/or prototypes for the API? Your design
>> > > > > > > > > > > > > suggestions seem to make sense, but I would like to see how all this
>> > > > > > > > > > > > > will look from the user's standpoint.
>> > > > > > > > > > > > >
>> > > > > > > > > > > > > -Val
>> > > > > > > > > > > > >
>> > > > > > > > > > > > > On Wed, Jun 7, 2017 at 1:06 AM, Антон Чураев <
>> > > > > > > > [hidden email]
>> > > > > > > > > >
>> > > > > > > > > > > > wrote:
>> > > > > > > > > > > > >
>> > > > > > > > > > > > > > Vyacheslav, correct me if something wrong
>> > > > > > > > > > > > > >
>> > > > > > > > > > > > > > We could provide opportunity of choose between
>> CPU
>> > > > usage
>> > > > > > and
>> > > > > > > > > > MEM/NET
>> > > > > > > > > > > > > usage
>> > > > > > > > > > > > > > for users by compression some attributes of
>> stored
>> > > > > objects.
>> > > > > > > > > > > > > > You have learned design, and it is possible to
>> > > localize
>> > > > > > > changes
>> > > > > > > > > in
>> > > > > > > > > > > > > > marshalling without affecting performance and
>> current
>> > > > > > > > functionality.
>> > > > > > > > > > > > > >
>> > > > > > > > > > > > > > I think that it's useful for our project and
>> > users.
>> > > > > > > > > > > > > > Community, what do you think about this
>> proposal?
>> > > > > > > > > > > > > >
>> > > > > > > > > > > > > >
>> > > > > > > > > > > > > > 2017-06-06 17:29 GMT+03:00 Vyacheslav Daradur <
>> > > > > > > > > [hidden email]
>> > > > > > > > > > >:
>> > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > In short,
>> > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > During marshalling a field is represented as
>> > > > > > > > > BinaryFieldAccessor
>> > > > > > > > > > > > which
>> > > > > > > > > > > > > > > manages its marshalling. It checks if the
>> field
>> > is
>> > > > > marked
>> > > > > > > by
>> > > > > > > > > > > > annotation
>> > > > > > > > > > > > > > > @BinaryCompression, in that case - binary
>> > > > > representation
>> > > > > > > of
>> > > > > > > > > > field
>> > > > > > > > > > > > > (bytes
>> > > > > > > > > > > > > > > array) will be compressed. It will be marked
>> as
>> > > > > > compressed
>> > > > > > > by
>> > > > > > > > > > types
>> > > > > > > > > > > > > > > constant (GridBinaryMarshaller.COMPRESSED),
>> > after
>> > > > this
>> > > > > > the
>> > > > > > > > > > > > compressed
>> > > > > > > > > > > > > > > bytes
>> > > > > > > > > > > > > > > array will be included in binary
>> representation of
>> > > > whole
>> > > > > > > > object.
>> > > > > > > > > > > Note,
>> > > > > > > > > > > > > > > header of marshalled object will not be
>> > compressed.
>> > > > > > > > Compression
>> > > > > > > > > > > > > affects
>> > > > > > > > > > > > > > > only object's field representation.
>> > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > Objects in IgniteCache are represented as
>> > > BinaryObject
>> > > > > > which
>> > > > > > > > is
>> > > > > > > > > > > > wrapper
>> > > > > > > > > > > > > > over
>> > > > > > > > > > > > > > > bytes array of marshalled object.
>> > > > > > > > > > > > > > > BinaryObject provides some useful methods,
>> which
>> > > are
>> > > > > > used
>> > > > > > > by
>> > > > > > > > > > > Ignite
>> > > > > > > > > > > > > > > systems.
>> > > > > > > > > > > > > > > For example, the Queries use
>> BinaryObject#field
>> > > > method,
>> > > > > > > which
>> > > > > > > > > > > > > > deserializes
>> > > > > > > > > > > > > > > only field of object, without deserializing of
>> > > whole
>> > > > > > > object.
>> > > > > > > > > > > > > > > BinaryObject#field method during
>> deserialization,
>> > > if
>> > > > > > meets
>> > > > > > > > the
>> > > > > > > > > > > > constant
>> > > > > > > > > > > > > > of
>> > > > > > > > > > > > > > > compressed type, decompresses this bytes array,
>> > then
>> > > > > > continue
>> > > > > > > > > > > > > unmarshalling
>> > > > > > > > > > > > > > > as usual.
>> > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > Now, I introduced the Compressor interface in
>> > > > > > > > > > IgniteConfigurations,
>> > > > > > > > > > > > it
>> > > > > > > > > > > > > > > allows the user to use their own implementation of
>> > > compressor -
>> > > > > it
>> > > > > > is
>> > > > > > > > the
>> > > > > > > > > > > > > > requirement
>> > > > > > > > > > > > > > > in the task[1].
>> > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > As far as I know, Vladimir Ozerov doesn't like
>> > the
>> > > > idea
>> > > > > > of
>> > > > > > > > > > granting
>> > > > > > > > > > > > > this
>> > > > > > > > > > > > > > > opportunity to the user.
>> > > > > > > > > > > > > > > In that case we can choose a compression
>> > algorithm
>> > > > > which
>> > > > > > we
>> > > > > > > > > will
>> > > > > > > > > > > > > provide
>> > > > > > > > > > > > > > by
>> > > > > > > > > > > > > > > default and will move the interface to
>> internals
>> > of
>> > > > > > binary
>> > > > > > > > > > > > > > infrastructure.
>> > > > > > > > > > > > > > > For this case I've prepared benchmarks, which
>> > I've
>> > > > > sent
>> > > > > > > > > earlier.
>> > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > I vote for ZSTD algorithm[2], it provides good
>> > > > > > compression
>> > > > > > > > > ratio
>> > > > > > > > > > > and
>> > > > > > > > > > > > > good
>> > > > > > > > > > > > > > > throughput. It has implementations in Java,
>> .NET
>> > and
>> > > > > C++,
>> > > > > > > and
>> > > > > > > > > has
>> > > > > > > > > > > > > > > ASF-friendly license, so we can use it in all
>> > > Ignite
>> > > > > > > > > platforms.
>> > > > > > > > > > > > > > > You can look at an assessment of this
>> algorithm
>> > in
>> > > my
>> > > > > > > > > benchmark's
>> > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > [1] https://issues.apache.org/
>> > > > jira/browse/IGNITE-3592
>> > > > > > > > > > > > > > > [2]https://github.com/facebook/zstd
>> > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > 2017-06-06 16:02 GMT+03:00 Антон Чураев <
>> > > > > > > > [hidden email]
>> > > > > > > > > >:
>> > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > Looks good to me.
>> > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > Could you propose a design of the implementation
>> in
>> > > > couple
>> > > > > of
>> > > > > > > > > > > sentences?
>> > > > > > > > > > > > > > > > So that we can estimate the completeness and
>> > > > > complexity
>> > > > > > > of
>> > > > > > > > > the
>> > > > > > > > > > > > > > proposal.
>> > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > 2017-06-06 15:26 GMT+03:00 Vyacheslav
>> Daradur <
>> > > > > > > > > > > [hidden email]
>> > > > > > > > > > > > >:
>> > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > Anton,
>> > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > Of course, the solution does not affect the
>> > > > existing
>> > > > > > > > > > > > > implementation. I
>> > > > > > > > > > > > > > > > mean,
>> > > > > > > > > > > > > > > > > there are no changes if the user does not use the
>> > > > annotation
>> > > > > > > > > > > > > > @BinaryCompression.
>> > > > > > > > > > > > > > > > (no
>> > > > > > > > > > > > > > > > > performance changes)
>> > > > > > > > > > > > > > > > > Only if the user decides to use
>> compression
>> > > on
>> > > > > > > specific
>> > > > > > > > > > field
>> > > > > > > > > > > > or
>> > > > > > > > > > > > > > > fields
>> > > > > > > > > > > > > > > > > of a class - in that case compression
>> will be
>> > > > used
>> > > > > at
>> > > > > > > > > > > marshalling
>> > > > > > > > > > > > > in
>> > > > > > > > > > > > > > > > > relation to annotated fields.
>> > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > 2017-06-06 15:10 GMT+03:00 Антон Чураев <
>> > > > > > > > > > [hidden email]
>> > > > > > > > > > > >:
>> > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > Vyacheslav,
>> > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > Is it possible to propose implementation
>> > that
>> > > > can
>> > > > > > be
>> > > > > > > > > > switched
>> > > > > > > > > > > > on
>> > > > > > > > > > > > > > > > > on-demand?
>> > > > > > > > > > > > > > > > > > In this case it should not affect
>> > performance
>> > > > of
>> > > > > > > > current
>> > > > > > > > > > > > > solution.
>> > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > I mean that users should decide
>> > what
>> > > is
>> > > > > > more
>> > > > > > > > > > > important
>> > > > > > > > > > > > > for
>> > > > > > > > > > > > > > > > them:
>> > > > > > > > > > > > > > > > > > throughput or memory/net usage.
>> > > > > > > > > > > > > > > > > > Maybe they will choose not all
>> objects,
>> > > or
>> > > > > only
>> > > > > > > > some
>> > > > > > > > > > > > > attributes
>> > > > > > > > > > > > > > > of
>> > > > > > > > > > > > > > > > > > objects to compress.
>> > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > 2017-06-06 14:48 GMT+03:00 Vyacheslav
>> > > Daradur <
>> > > > > > > > > > > > > [hidden email]
>> > > > > > > > > > > > > > >:
>> > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > Conclusion:
>> > > > > > > > > > > > > > > > > > > The provided solution allows reducing the size
>> of
>> > an
>> > > > > object
>> > > > > > > in
>> > > > > > > > > > > > > IgniteCache
>> > > > > > > > > > > > > > at
>> > > > > > > > > > > > > > > > the
>> > > > > > > > > > > > > > > > > > > cost of throughput reduction (small -
>> in
>> > > some
>> > > > > > > cases),
>> > > > > > > > > it
>> > > > > > > > > > > > > depends
>> > > > > > > > > > > > > > on
>> > > > > > > > > > > > > > > > > part
>> > > > > > > > > > > > > > > > > > of
>> > > > > > > > > > > > > > > > > > > object which will be compressed and
>> > > > compression
>> > > > > > > > > > algorithm.
>> > > > > > > > > > > > > > > > > > > I mean, we can make more effective
>> use of
>> > > > > memory,
>> > > > > > > and
>> > > > > > > > > in
>> > > > > > > > > > > some
>> > > > > > > > > > > > > > cases
>> > > > > > > > > > > > > > > > it
>> > > > > > > > > > > > > > > > > > can
>> > > > > > > > > > > > > > > > > > > reduce loading of the interconnect.
>> > > > > (replication,
>> > > > > > > > > > > > rebalancing)
>> > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > Especially, it will be particularly
>> > useful
>> > > > for
>> > > > > > > > object's
>> > > > > > > > > > > > fields
>> > > > > > > > > > > > > > > which
>> > > > > > > > > > > > > > > > > are
>> > > > > > > > > > > > > > > > > > > large text (>~ 250 bytes) and can be
>> > > > > effectively
>> > > > > > > > > > > compressed.
>> > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > 2017-06-06 12:00 GMT+03:00 Антон
>> Чураев <
>> > > > > > > > > > > > [hidden email]
>> > > > > > > > > > > > > >:
>> > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > > Vyacheslav, thank you! But could you
>> > > please
>> > > > > > > > provide
>> > > > > > > > > > > > > > conclusions
>> > > > > > > > > > > > > > > > or
>> > > > > > > > > > > > > > > > > > > > proposals based on these benchmarks?
>> > > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > > 2017-06-06 11:28 GMT+03:00
>> Vyacheslav
>> > > > > Daradur <
>> > > > > > > > > > > > > > > [hidden email]
>> > > > > > > > > > > > > > > > >:
>> > > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > > > Dmitry,
>> > > > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > > > Excel-pages:
>> > > > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > > > 1). "Compression ratio (2)" -
>> shows
>> > > > object
>> > > > > > > size,
>> > > > > > > > > with
>> > > > > > > > > > > > > > > compression
>> > > > > > > > > > > > > > > > > and
>> > > > > > > > > > > > > > > > > > > > > without compression. (Conditions:
>> > > literal
>> > > > > > text)
>> > > > > > > > > > > > > > > > > > > > > 1st graph shows compression
>> ratios of
>> > > > using
>> > > > > > > > > different
>> > > > > > > > > > > > > > > compression
>> > > > > > > > > > > > > > > > > > > > algorithms
>> > > > > > > > > > > > > > > > > > > > > depending on size of compressed
>> > field.
>> > > > > > > > > > > > > > > > > > > > > 2nd graph shows evaluation of
>> size of
>> > > > > objects
>> > > > > > > > > > depending
>> > > > > > > > > > > > on
>> > > > > > > > > > > > > > > sizes
>> > > > > > > > > > > > > > > > > and
>> > > > > > > > > > > > > > > > > > > > > compression algorithms.
>> > > > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > > > 2). "Compression ratio (1)" -
>> shows
>> > > > object
>> > > > > > > size,
>> > > > > > > > > with
>> > > > > > > > > > > > > > > compression
>> > > > > > > > > > > > > > > > > and
>> > > > > > > > > > > > > > > > > > > > > without compression. (Conditions:
>> > > badly
>> > > > > > > > compressed
>> > > > > > > > > > > > > character
>> > > > > > > > > > > > > > > > > > sequence)
>> > > > > > > > > > > > > > > > > > > > > 1st graph shows compression
>> ratios of
>> > > > using
>> > > > > > > > > different
>> > > > > > > > > > > > > > > compression
>> > > > > > > > > > > > > > > > > > > > > algorithms depending on size of
>> > > compressed
>> > > > > > > field.
>> > > > > > > > > > > > > > > > > > > > > 2nd graph shows evaluation of
>> size of
>> > > > > objects
>> > > > > > > > > > depending
>> > > > > > > > > > > > on
>> > > > > > > > > > > > > > > sizes
>> > > > > > > > > > > > > > > > > and
>> > > > > > > > > > > > > > > > > > > > > compression algorithms.
>> > > > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > > > 3) "put-avg" - shows average time
>> of
>> > > the
>> > > > > > "put"
>> > > > > > > > > > > operation
>> > > > > > > > > > > > > > > > depending
>> > > > > > > > > > > > > > > > > on
>> > > > > > > > > > > > > > > > > > > > size
>> > > > > > > > > > > > > > > > > > > > > and compression algorithms.
>> > > > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > > > 4) "put-thrpt" - shows throughput
>> of
>> > > the
>> > > > > > "put"
>> > > > > > > > > > > operation
>> > > > > > > > > > > > > > > > depending
>> > > > > > > > > > > > > > > > > on
>> > > > > > > > > > > > > > > > > > > > size
>> > > > > > > > > > > > > > > > > > > > > and compression algorithms.
>> > > > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > > > 5) "get-avg" - shows average time
>> of
>> > > the
>> > > > > > "get"
>> > > > > > > > > > > operation
>> > > > > > > > > > > > > > > > depending
>> > > > > > > > > > > > > > > > > on
>> > > > > > > > > > > > > > > > > > > > size
>> > > > > > > > > > > > > > > > > > > > > and compression algorithms.
>> > > > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > > > 6) "get-thrpt" - shows throughput
>> of
>> > > the
>> > > > > > "get"
>> > > > > > > > > > > operation
>> > > > > > > > > > > > > > > > depending
>> > > > > > > > > > > > > > > > > on
>> > > > > > > > > > > > > > > > > > > > size
>> > > > > > > > > > > > > > > > > > > > > and compression algorithms.
>> > > > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > > > 2017-06-06 10:59 GMT+03:00 Dmitriy
>> > > > > Setrakyan
>> > > > > > <
>> > > > > > > > > > > > > > > > > [hidden email]
>> > > > > > > > > > > > > > > > > > >:
>> > > > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > > > > Vladimir, I am not sure how to
>> > > > interpret
>> > > > > > the
>> > > > > > > > > > graphs?
>> > > > > > > > > > > > What
>> > > > > > > > > > > > > > are
>> > > > > > > > > > > > > > > > we
>> > > > > > > > > > > > > > > > > > > > looking
>> > > > > > > > > > > > > > > > > > > > > > at?
>> > > > > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > > > > On Tue, Jun 6, 2017 at 12:33 AM,
>> > > > > Vyacheslav
>> > > > > > > > > > Daradur <
>> > > > > > > > > > > > > > > > > > > > [hidden email]
>> > > > > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > > > > wrote:
>> > > > > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > > > > > Hi, Igniters.
>> > > > > > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > > > > > I've prepared some
>> benchmarking.
>> > > > > Results
>> > > > > > > [1].
>> > > > > > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > > > > > And I've prepared the
>> evaluation
>> > in
>> > > > the
>> > > > > > > form
>> > > > > > > > of
>> > > > > > > > > > > > > diagrams
>> > > > > > > > > > > > > > > [2].
>> > > > > > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > > > > > I hope that helps to interest
>> the
>> > > > > > community
>> > > > > > > > and
>> > > > > > > > > > > > > > > accelerates a
>> > > > > > > > > > > > > > > > > > > > reaction
>> > > > > > > > > > > > > > > > > > > > > to
>> > > > > > > > > > > > > > > > > > > > > > > this improvement :)
>> > > > > > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > > > > > [1]
>> > > > > > > > > > > > > > > > > > > > > > > https://github.com/daradurvs/
>> > > > > > > > > > > > ignite-compression/tree/
>> > > > > > > > > > > > > > > > > > > > > > > master/src/main/resources/resu
>> lt
>> > > > > > > > > > > > > > > > > > > > > > > [2]
>> > https://drive.google.com/file/
>> > > d/
>> > > > > > > > > > > > > > > > > > 0B2CeUAOgrHkoMklyZ25YTEdKcEk/
>> > > > > > > > > > > > > > > > > > > > view
>> > > > > > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > > > > > 2017-05-24 9:49 GMT+03:00
>> > > Vyacheslav
>> > > > > > > Daradur
>> > > > > > > > <
>> > > > > > > > > > > > > > > > > > [hidden email]
>> > > > > > > > > > > > > > > > > > > >:
>> > > > > > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > > > > > > Guys, any thoughts?
>> > > > > > > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > > > > > > 2017-05-16 13:40 GMT+03:00
>> > > > Vyacheslav
>> > > > > > > > > Daradur <
>> > > > > > > > > > > > > > > > > > > [hidden email]
>> > > > > > > > > > > > > > > > > > > > >:
>> > > > > > > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > > > > > >> Hi guys,
>> > > > > > > > > > > > > > > > > > > > > > > >>
>> > > > > > > > > > > > > > > > > > > > > > > >> I've prepared the PR to
>> show
>> > my
>> > > > > idea.
>> > > > > > > > > > > > > > > > > > > > > > > >> https://github.com/apache/
>> > > > > > > > > > > ignite/pull/1951/files
>> > > > > > > > > > > > > > > > > > > > > > > >>
>> > > > > > > > > > > > > > > > > > > > > > > >> About querying - I've just
>> > > copied
>> > > > > > > existing
>> > > > > > > > > > tests
>> > > > > > > > > > > > and
>> > > > > > > > > > > > > > > have
>> > > > > > > > > > > > > > > > > > > > annotated
>> > > > > > > > > > > > > > > > > > > > > > the
>> > > > > > > > > > > > > > > > > > > > > > > >> testing data.
>> > > > > > > > > > > > > > > > > > > > > > > >> https://github.com/apache/
>> > > > > > > > > > > > > > ignite/pull/1951/files#diff-
>> > > > > > > > > > > > > > > > > c19a9d
>> > > > > > > > > > > > > > > > > > > > > > > >> f4058141d059bb577e75244764
>> > > > > > > > > > > > > > > > > > > > > > > >>
>> > > > > > > > > > > > > > > > > > > > > > > >> It means fields which will
>> be
>> > > > marked
>> > > > > > by
>> > > > > > > > > > > > > > > @BinaryCompression
>> > > > > > > > > > > > > > > > > > will
>> > > > > > > > > > > > > > > > > > > be
>> > > > > > > > > > > > > > > > > > > > > > > >> compressed at marshalling
>> via
>> > > > > > > > > > BinaryMarshaller.
>> > > > > > > > > > > > > > > > > > > > > > > >>
>> > > > > > > > > > > > > > > > > > > > > > > >> This solution has no
>> effect on
>> > > > > > existing
>> > > > > > > > data
>> > > > > > > > > > or
>> > > > > > > > > > > > > > project
>> > > > > > > > > > > > > > > > > > > > > architecture.
>> > > > > > > > > > > > > > > > > > > > > > > >>
>> > > > > > > > > > > > > > > > > > > > > > > >> I'll be glad to see your
>> > > thoughts.
>> > > > > > > > > > > > > > > > > > > > > > > >>
>> > > > > > > > > > > > > > > > > > > > > > > >>
>> > > > > > > > > > > > > > > > > > > > > > > >> 2017-05-15 19:18 GMT+03:00
>> > > > > Vyacheslav
>> > > > > > > > > Daradur
>> > > > > > > > > > <
>> > > > > > > > > > > > > > > > > > > > [hidden email]
>> > > > > > > > > > > > > > > > > > > > > >:
>> > > > > > > > > > > > > > > > > > > > > > > >>
>> > > > > > > > > > > > > > > > > > > > > > > >>> Dmitriy,
>> > > > > > > > > > > > > > > > > > > > > > > >>>
>> > > > > > > > > > > > > > > > > > > > > > > >>> I have a ready prototype. I
>> > want
>> > > to
>> > > > > > show
>> > > > > > > > it.
>> > > > > > > > > > > > > > > > > > > > > > > >>> It is always easier to
>> > discuss
>> > > on
>> > > > > > > > example.
>> > > > > > > > > > > > > > > > > > > > > > > >>>
>> > > > > > > > > > > > > > > > > > > > > > > >>> 2017-05-15 19:02 GMT+03:00
>> > > > Dmitriy
>> > > > > > > > > Setrakyan
>> > > > > > > > > > <
>> > > > > > > > > > > > > > > > > > > > > [hidden email]
>> > > > > > > > > > > > > > > > > > > > > > >:
>> > > > > > > > > > > > > > > > > > > > > > > >>>
>> > > > > > > > > > > > > > > > > > > > > > > >>>> Vyacheslav,
>> > > > > > > > > > > > > > > > > > > > > > > >>>>
>> > > > > > > > > > > > > > > > > > > > > > > >>>> I think it is a bit
>> > premature
>> > > to
>> > > > > > > > provide a
>> > > > > > > > > > PR
>> > > > > > > > > > > > > > without
>> > > > > > > > > > > > > > > > > > getting
>> > > > > > > > > > > > > > > > > > > a
>> > > > > > > > > > > > > > > > > > > > > > > >>>> community
>> > > > > > > > > > > > > > > > > > > > > > > >>>> consensus on the dev
>> list.
>> > > > Please
>> > > > > > > allow
>> > > > > > > > > some
>> > > > > > > > > > > > time
>> > > > > > > > > > > > > > for
>> > > > > > > > > > > > > > > > the
>> > > > > > > > > > > > > > > > > > > > > community
>> > > > > > > > > > > > > > > > > > > > > > to
>> > > > > > > > > > > > > > > > > > > > > > > >>>> respond.
>> > > > > > > > > > > > > > > > > > > > > > > >>>>
>> > > > > > > > > > > > > > > > > > > > > > > >>>> D.
>> > > > > > > > > > > > > > > > > > > > > > > >>>>
>> > > > > > > > > > > > > > > > > > > > > > > >>>> On Mon, May 15, 2017 at
>> 6:36
>> > > AM,
>> > > > > > > > > Vyacheslav
>> > > > > > > > > > > > > Daradur
>> > > > > > > > > > > > > > <
>> > > > > > > > > > > > > > > > > > > > > > > >>>> [hidden email]>
>> > > > > > > > > > > > > > > > > > > > > > > >>>> wrote:
>> > > > > > > > > > > > > > > > > > > > > > > >>>>
>> > > > > > > > > > > > > > > > > > > > > > > >>>> > I created the ticket:
>> > > > > > > > > > > > > > > https://issues.apache.org/jira
>> > > > > > > > > > > > > > > > > > > > > > > >>>> /browse/IGNITE-5226
>> > > > > > > > > > > > > > > > > > > > > > > >>>> >
>> > > > > > > > > > > > > > > > > > > > > > > >>>> > I'll prepare a PR with
>> > > > described
>> > > > > > > > > solution
>> > > > > > > > > > in
>> > > > > > > > > > > > > > couple
>> > > > > > > > > > > > > > > of
>> > > > > > > > > > > > > > > > > > days.
>> > > > > > > > > > > > > > > > > > > > > > > >>>> >
>> > > > > > > > > > > > > > > > > > > > > > > >>>> > 2017-05-15 15:05
>> GMT+03:00
>> > > > > > > Vyacheslav
>> > > > > > > > > > > Daradur
>> > > > > > > > > > > > <
>> > > > > > > > > > > > > > > > > > > > > > [hidden email]
>> > > > > > > > > > > > > > > > > > > > > > > >:
>> > > > > > > > > > > > > > > > > > > > > > > >>>> >
>> > > > > > > > > > > > > > > > > > > > > > > >>>> > > Hi, Igniters!
>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >
>> > > > > > > > > > > > > > > > > > > > > > > >>>> > > Apache Ignite 2.0 is
>> released.
>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >
>> > > > > > > > > > > > > > > > > > > > > > > >>>> > > Let's continue the
>> > > > discussion
>> > > > > > > about
>> > > > > > > > a
>> > > > > > > > > > > > > > compression
>> > > > > > > > > > > > > > > > > > design.
>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >
>> > > > > > > > > > > > > > > > > > > > > > > >>>> > > At the moment, I
>> found
>> > > only
>> > > > > one
>> > > > > > > > > solution
>> > > > > > > > > > > > which
>> > > > > > > > > > > > > > is
>> > > > > > > > > > > > > > > > > > > compatible
>> > > > > > > > > > > > > > > > > > > > > > with
>> > > > > > > > > > > > > > > > > > > > > > > >>>> > querying
>> > > > > > > > > > > > > > > > > > > > > > > >>>> > > and indexing, this is
>> > > > > > > > > per-objects-field
>> > > > > > > > > > > > > > > compression.
>> > > > > > > > > > > > > > > > > > > > > > > >>>> > > Per-fields
>> compression
>> > > means
>> > > > > > that
>> > > > > > > > > > metadata
>> > > > > > > > > > > > (a
>> > > > > > > > > > > > > > > > header)
>> > > > > > > > > > > > > > > > > of
>> > > > > > > > > > > > > > > > > > > an
>> > > > > > > > > > > > > > > > > > > > > > object
>> > > > > > > > > > > > > > > > > > > > > > > >>>> won't
>> > > > > > > > > > > > > > > > > > > > > > > >>>> > > be compressed, only
>> > > > serialized
>> > > > > > > > values
>> > > > > > > > > of
>> > > > > > > > > > > an
>> > > > > > > > > > > > > > object
>> > > > > > > > > > > > > > > > > > fields
>> > > > > > > > > > > > > > > > > > > > (in
>> > > > > > > > > > > > > > > > > > > > > > > bytes
>> > > > > > > > > > > > > > > > > > > > > > > >>>> array
>> > > > > > > > > > > > > > > > > > > > > > > >>>> > > form) will be
>> > compressed.
>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >
>> > > > > > > > > > > > > > > > > > > > > > > >>>> > > This solution has
>> some
>> > > > > > > contentious
>> > > > > > > > > > > issues:
>> > > > > > > > > > > > > > > > > > > > > > > >>>> > > - small values, like
>> > > > > primitives
>> > > > > > > and
>> > > > > > > > > > short
>> > > > > > > > > > > > > > arrays -
>> > > > > > > > > > > > > > > > > there
>> > > > > > > > > > > > > > > > > > > > isn't
>> > > > > > > > > > > > > > > > > > > > > > > >>>> sense to
>> > > > > > > > > > > > > > > > > > > > > > > >>>> > > compress them;
>> > > > > > > > > > > > > > > > > > > > > > > >>>> > > - it is not
>> possible
>> > to
>> > > > use
>> > > > > > > > > > compression
>> > > > > > > > > > > > with
>> > > > > > > > > > > > > > > > > > > > java-predefined
>> > > > > > > > > > > > > > > > > > > > > > > >>>> types;
>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >
>> > > > > > > > > > > > > > > > > > > > > > > >>>> > > We can provide an
>> > > > annotation,
>> > > > > > > > > > > > > > @IgniteCompression -
>> > > > > > > > > > > > > > > > for
>> > > > > > > > > > > > > > > > > > > > > example,
>> > > > > > > > > > > > > > > > > > > > > > > >>>> which can
>> > > > > > > > > > > > > > > > > > > > > > > >>>> > > be used by users for
>> > > marking
>> > > > > > > fields
>> > > > > > > > to
>> > > > > > > > > > > > > compress.
>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >
>> > > > > > > > > > > > > > > > > > > > > > > >>>> > > Any thoughts?
>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >
>> > > > > > > > > > > > > > > > > > > > > > > >>>> > > Maybe someone already
>> > has a
>> > > > > ready
>> > > > > > > > > design?
>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >
>> > > > > > > > > > > > > > > > > > > > > > > >>>> > > 2017-04-10 11:06
>> > GMT+03:00
>> > > > > > > > Vyacheslav
>> > > > > > > > > > > > Daradur
>> > > > > > > > > > > > > <
>> > > > > > > > > > > > > > > > > > > > > > > [hidden email]
>> > > > > > > > > > > > > > > > > > > > > > > >>>> >:
>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >
>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >> Alexey,
>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>
>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >> Yes, I've read it.
>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>
>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >> Ok, let's discuss
>> about
>> > > > > public
>> > > > > > > API
>> > > > > > > > > > > design.
>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>
>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >> I think we need to
>> add
>> > > > some a
>> > > > > > > > > configure
>> > > > > > > > > > > > > entity
>> > > > > > > > > > > > > > to
>> > > > > > > > > > > > > > > > > > > > > > > >>>> CacheConfiguration,
>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >> which will contain
>> the
>> > > > > > Compressor
>> > > > > > > > > > > interface
>> > > > > > > > > > > > > > > > > > > implementation
>> > > > > > > > > > > > > > > > > > > > > and
>> > > > > > > > > > > > > > > > > > > > > > > some
>> > > > > > > > > > > > > > > > > > > > > > > >>>> > useful
>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >> parameters.
>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >> Or maybe to provide
>> a
>> > > > > > > > > BinaryMarshaller
>> > > > > > > > > > > > > > decorator,
>> > > > > > > > > > > > > > > > > which
>> > > > > > > > > > > > > > > > > > > > will
>> > > > > > > > > > > > > > > > > > > > > be
>> > > > > > > > > > > > > > > > > > > > > > > >>>> compress
>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >> data after
>> marshalling.
>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>
>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>
>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >> 2017-04-10 10:40
>> > > GMT+03:00
>> > > > > > Alexey
>> > > > > > > > > > > > Kuznetsov <
>> > > > > > > > > > > > > > > > > > > > > > > [hidden email]
>> > > > > > > > > > > > > > > > > > > > > > > >>>> >:
>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>
>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> Vyacheslav,
>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>>
>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> Did you read
>> initial
>> > > > > > discussion
>> > > > > > > > [1]
>> > > > > > > > > > > about
>> > > > > > > > > > > > > > > > > compression?
>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> As far as I
>> remember
>> > we
>> > > > > agreed
>> > > > > > > to
>> > > > > > > > > add
>> > > > > > > > > > > only
>> > > > > > > > > > > > > > some
>> > > > > > > > > > > > > > > > > > > > "top-level"
>> > > > > > > > > > > > > > > > > > > > > > API
>> > > > > > > > > > > > > > > > > > > > > > > in
>> > > > > > > > > > > > > > > > > > > > > > > >>>> > order
>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> to
>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> provide a way for
>> > > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> Ignite users to inject some sort of custom compression.
>> > > > > > > > > > > > > > > > > > > > > > > > >>>> > >>>
>> > > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> [1] http://apache-ignite-developers.2346864.n4.nabble.com/Data-compression-in-Ignite-2-0-td10099.html
>> > > > > > > > > > > > > > > > > > > > > > > > >>>> > >>>
>> > > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> On Mon, Apr 10, 2017 at 2:19 PM, daradurvs <[hidden email]> wrote:
>> > > > > > > > > > > > > > > > > > > > > > > > >>>> > >>>
>> > > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > Hi Igniters!
>> > > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> >
>> > > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > I am interested in this task.
>> > > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > Provide some kind of pluggable compression SPI support
>> > > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > <https://issues.apache.org/jira/browse/IGNITE-3592>
>> > > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> >
>> > > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > I developed a solution on BinaryMarshaller-level, but reviewer has rejected it.
>> > > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> >
>> > > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > Let's continue discussion of task goals and solution design.
>> > > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > As I understood that, the main goal of this task is to store data in compressed form.
>> > > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > This is what I need from Ignite as its user. Compression provides economy on servers.
>> > > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > We can store more data on same servers at the cost of increasing CPU utilization.
>> > > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> >
>> > > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > I'm researching a possibility of implementation of compression at the cache-level.
>> > > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> >
>> > > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > Any thoughts?
>> > > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> >
>> > > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > --
>> > > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > Best regards,
>> > > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > Vyacheslav
>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> >
>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> >
>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> >
>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> >
>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > --
>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > View this
>> message in
>> > > > > > context:
>> > > > > > > > > > > > > > > > > http://apache-ignite-
>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> >
>> > > > > > developers.2346864.n4.nabble.
>> > > > > > > > > > > > > > > > > > com/Data-compression-in-
>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> >
>> > > > > > Ignite-2-0-tp10099p16317.html
>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > Sent from the
>> Apache
>> > > > > Ignite
>> > > > > > > > > > Developers
>> > > > > > > > > > > > > > mailing
>> > > > > > > > > > > > > > > > > list
>> > > > > > > > > > > > > > > > > > > > > archive
>> > > > > > > > > > > > > > > > > > > > > > at
>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> Nabble.com.
>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> >
>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>>
>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>>
>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>>
>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> --
>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> Alexey Kuznetsov
>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>>
>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>
>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>
>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>
>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >> --
>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >> Best Regards,
>> > Vyacheslav
>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>
>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >
>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >
>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >
>> > > > > > > > > > > > > > > > > > > > > > > >>>> > > --
>> > > > > > > > > > > > > > > > > > > > > > > >>>> > > Best Regards,
>> Vyacheslav
>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >
>> > > > > > > > > > > > > > > > > > > > > > > >>>> >
>> > > > > > > > > > > > > > > > > > > > > > > >>>> >
>> > > > > > > > > > > > > > > > > > > > > > > >>>> >
>> > > > > > > > > > > > > > > > > > > > > > > >>>> > --
>> > > > > > > > > > > > > > > > > > > > > > > >>>> > Best Regards,
>> Vyacheslav
>> > > > > > > > > > > > > > > > > > > > > > > >>>> >
>> > > > > > > > > > > > > > > > > > > > > > > >>>>
>> > > > > > > > > > > > > > > > > > > > > > > >>>
>> > > > > > > > > > > > > > > > > > > > > > > >>>
>> > > > > > > > > > > > > > > > > > > > > > > >>>
>> > > > > > > > > > > > > > > > > > > > > > > >>> --
>> > > > > > > > > > > > > > > > > > > > > > > >>> Best Regards, Vyacheslav
>> > > > > > > > > > > > > > > > > > > > > > > >>>
>> > > > > > > > > > > > > > > > > > > > > > > >>
>> > > > > > > > > > > > > > > > > > > > > > > >>
>> > > > > > > > > > > > > > > > > > > > > > > >>
>> > > > > > > > > > > > > > > > > > > > > > > >> --
>> > > > > > > > > > > > > > > > > > > > > > > >> Best Regards, Vyacheslav
>> > > > > > > > > > > > > > > > > > > > > > > >>
>> > > > > > > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > > > > > > --
>> > > > > > > > > > > > > > > > > > > > > > > > Best Regards, Vyacheslav
>> > > > > > > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > > > > > --
>> > > > > > > > > > > > > > > > > > > > > > > Best Regards, Vyacheslav
>> > > > > > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > > > --
>> > > > > > > > > > > > > > > > > > > > > Best Regards, Vyacheslav
>> > > > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > > --
>> > > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > > Best Regards, Anton Churaev
>> > > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > --
>> > > > > > > > > > > > > > > > > > > Best Regards, Vyacheslav
>> > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > --
>> > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > Best Regards, Anton Churaev
>> > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > --
>> > > > > > > > > > > > > > > > > Best Regards, Vyacheslav
>> > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > --
>> > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > Best Regards, Anton Churaev
>> > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > --
>> > > > > > > > > > > > > > > Best Regards, Vyacheslav
>> > > > > > > > > > > > > > >
>> > > > > > > > > > > > > >
>> > > > > > > > > > > > > >
>> > > > > > > > > > > > > >
>> > > > > > > > > > > > > > --
>> > > > > > > > > > > > > >
>> > > > > > > > > > > > > > Best Regards, Anton Churaev
>> > > > > > > > > > > > > >
>> > > > > > > > > > > > >
>> > > > > > > > > > > >
>> > > > > > > > > > > >
>> > > > > > > > > > > >
>> > > > > > > > > > > > --
>> > > > > > > > > > > > Best Regards, Vyacheslav
>> > > > > > > > > > > >
>> > > > > > > > > > >
>> > > > > > > > > >
>> > > > > > > > > >
>> > > > > > > > > >
>> > > > > > > > > > --
>> > > > > > > > > > Best Regards, Vyacheslav
>> > > > > > > > > >
>> > > > > > > > >
>> > > > > > > > >
>> > > > > > > > >
>> > > > > > > > > --
>> > > > > > > > >
>> > > > > > > > > Best Regards, Anton Churaev
>> > > > > > > > >
>> > > > > > > >
>> > > > > > > >
>> > > > > > > >
>> > > > > > > > --
>> > > > > > > > Best Regards, Vyacheslav
>> > > > > > > >
>> > > > > > >
>> > > > > >
>> > > > > >
>> > > > > >
>> > > > > > --
>> > > > > > Best Regards, Vyacheslav
>> > > > > >
>> > > > >
>> > > >
>> > > >
>> > > >
>> > > > --
>> > > > Best Regards, Vyacheslav
>> > > >
>> > >
>> >
>> >
>> >
>> > --
>> >
>> > Best Regards, Anton Churaev
>> >
>>
>>
>>
>> --
>> Sergey Kozlov
>> GridGain Systems
>> www.gridgain.com
>>
>
>
>
> --
> Best Regards, Vyacheslav
>



--
Best Regards, Vyacheslav D.

Re: Data compression in Ignite 2.0

Vladimir Ozerov
Vyacheslav,

This is not about my needs, but about the product :-) BinaryObject is the
central entity used for both data transfer and data storage. This is both
good and bad at the same time.

The good thing is that as we optimize the binary protocol, we improve both
network and storage performance at the same time. We have at least 3 things
which will be included in the product soon: varint encoding [1], optimized
string encoding [2] and null-field optimization [3]. The bad thing is that
the binary object format is not well suited for data storage optimizations,
including compression. For example, one good compression technique is to
organize data in a column-store format, or to introduce a shared "dictionary"
of unique values at the cache level. In both cases N equal values are not
stored N times. Instead, we store one value and N references to it.
This way 2x-10x compression is possible depending on workload type. The
binary object protocol with some compression on top of it cannot give such an
improvement, because it compresses data within individual objects instead
of compressing the whole cache data in a single context.
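
To make the dictionary idea concrete, here is a minimal Java sketch. It is only an illustration of "one value and N references": the class DictionaryColumn and its methods are hypothetical names invented for this example and are not part of Ignite.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical illustration of dictionary encoding; not an Ignite class. */
final class DictionaryColumn {
    /** Maps each distinct value to its position in the dictionary. */
    private final Map<String, Integer> index = new HashMap<>();

    /** Each distinct value is stored exactly once. */
    private final List<String> dictionary = new ArrayList<>();

    /** One small reference per row instead of a full copy of the value. */
    private final List<Integer> refs = new ArrayList<>();

    /** Appends a row, reusing the dictionary entry if the value was seen before. */
    void add(String value) {
        Integer id = index.get(value);
        if (id == null) {
            id = dictionary.size();
            dictionary.add(value);
            index.put(value, id);
        }
        refs.add(id);
    }

    /** Resolves the reference stored for the given row back to the value. */
    String get(int row) {
        return dictionary.get(refs.get(row));
    }

    int rows() {
        return refs.size();
    }

    int distinctValues() {
        return dictionary.size();
    }
}
```

The saving comes from sharing one context across all rows, which is what compressing each object in isolation cannot do.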

That said, I propose to give up adding compression to BinaryObject. This is
a dead end. Instead, we should:
1) Optimize the protocol itself to be more compact, as described in the
aforementioned Ignite tickets
2) Start a new discussion about storage compression

You can read papers from other vendors to get a better understanding of
possible compression options. E.g. Oracle has a lot of compression
techniques, including heat maps, background compression, per-block
compression, data dictionaries, etc. [4].

[1] https://issues.apache.org/jira/browse/IGNITE-5097
[2] https://issues.apache.org/jira/browse/IGNITE-5655
[3] https://issues.apache.org/jira/browse/IGNITE-3939
[4] http://www.oracle.com/technetwork/database/options/compression/advanced-compression-wp-12c-1896128.pdf

Vladimir.


On Tue, Jul 11, 2017 at 6:56 PM, Vyacheslav Daradur <[hidden email]>
wrote:

> Hi Igniters!
>
> I'd like to continue developing and discussing compression in Ignite.
>
> Vladimir, could you propose a design for the compression feature in Ignite
> that would suit you?
>
> 2017-06-15 16:13 GMT+03:00 Vyacheslav Daradur <[hidden email]>:
>
>> Hi Igniters.
>>
>> Vladimir, I want to propose another design for the implementation of
>> per-field compression.
>>
>> 1) We will add a new step to the prepareForCache method (for example) of
>> CacheObject, or to GridCacheMapEntry.
>>
>> At this step, after marshalling an object, we will compress the fields of
>> the object that were described in advance.
>> The user will describe the class fields he wants to compress in a separate
>> entity, such as Metadata.
>>
>> For compression, we will introduce another entity, for example
>> CompressionProcessor, which will work with the byte array (marshalled object).
>> The entity will read the bytes of the described fields, compress them and
>> rewrite the binary representation of the whole object.
>> After processing, the object will be put in the cache.
>>
>> In this case the design does not depend on the binary infrastructure,
>> but there is a big heap-memory overhead for the buffer.
>>
>> 2) Another solution is to compress the byte array of the whole object when
>> copying it to off-heap.
>> But in this case I don't understand yet how to support
>> querying and indexing.
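
Option 1 in the quoted proposal could be sketched roughly as follows. This is an illustration under assumptions, not the proposed patch: FieldCompressionSketch, compressField, decompressField and the 4-byte length prefix are hypothetical, the field offset and length are assumed to be known from the user-supplied metadata, and java.util.zip.Deflater merely stands in for whatever codec would be chosen.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.util.Arrays;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

/** Hypothetical sketch of rewriting one field inside a marshalled byte array. */
final class FieldCompressionSketch {
    /** Replaces bytes [off, off + len) with a 4-byte length followed by the deflated bytes. */
    static byte[] compressField(byte[] marshalled, int off, int len) {
        byte[] packed = deflate(Arrays.copyOfRange(marshalled, off, off + len));
        ByteBuffer out = ByteBuffer.allocate(marshalled.length - len + 4 + packed.length);
        out.put(marshalled, 0, off);                                    // untouched prefix (header, earlier fields)
        out.putInt(packed.length);                                      // length of the compressed field
        out.put(packed);                                                // compressed field bytes
        out.put(marshalled, off + len, marshalled.length - off - len);  // untouched suffix
        return out.array();
    }

    /** Reverses compressField for the field that starts at the same offset. */
    static byte[] decompressField(byte[] stored, int off) {
        int packedLen = ByteBuffer.wrap(stored).getInt(off);
        byte[] field = inflate(Arrays.copyOfRange(stored, off + 4, off + 4 + packedLen));
        int tail = stored.length - off - 4 - packedLen;
        ByteBuffer out = ByteBuffer.allocate(off + field.length + tail);
        out.put(stored, 0, off);
        out.put(field);
        out.put(stored, off + 4 + packedLen, tail);
        return out.array();
    }

    private static byte[] deflate(byte[] src) {
        Deflater deflater = new Deflater();
        deflater.setInput(src);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[256];
        while (!deflater.finished())
            out.write(buf, 0, deflater.deflate(buf));
        deflater.end();
        return out.toByteArray();
    }

    private static byte[] inflate(byte[] src) {
        Inflater inflater = new Inflater();
        inflater.setInput(src);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[256];
        try {
            while (!inflater.finished()) {
                int n = inflater.inflate(buf);
                if (n == 0 && (inflater.needsInput() || inflater.needsDictionary()))
                    throw new IllegalStateException("Truncated or unsupported compressed field");
                out.write(buf, 0, n);
            }
        } catch (DataFormatException e) {
            throw new IllegalStateException("Corrupted compressed field", e);
        } finally {
            inflater.end();
        }
        return out.toByteArray();
    }
}
```

The temporary arrays allocated on every call are the heap overhead mentioned in the proposal, and any index or query that reads the field has to pay for decompressField first.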
>>
>>
>> 2017-06-09 11:21 GMT+03:00 Sergey Kozlov <[hidden email]>:
>>
>>> Hi
>>>
>>> * "Per-field compression" is applicable to huge BLOB fields and will
>>> impose restrictions such as being unable to index such fields, slower data
>>> retrieval, and potential OOM issues if the compression ratio is too high.
>>> But for some cases it makes sense.
>>>
>>> On Fri, Jun 9, 2017 at 11:11 AM, Антон Чураев <[hidden email]>
>>> wrote:
>>>
>>> > Seems that Dmitry is referring to transparent data encryption. It is used
>>> > throughout the whole database industry.
>>> >
>>> > 2017-06-09 10:50 GMT+03:00 Vladimir Ozerov <[hidden email]>:
>>> >
>>> > > Dima,
>>> > >
>>> > > Encryption of certain fields is as bad as compression. First, it is a huge
>>> > > change, which makes the already complex binary protocol even more complex.
>>> > > Second, it has to be ported to the CPP and .NET platforms, as well as to
>>> > > JDBC and ODBC.
>>> > > Last, but most important: it is not our headache to encrypt
>>> > > sensitive data. This is the user's responsibility. Nobody in their right mind
>>> > > will store passwords in plain form. Instead, the user should encrypt them on
>>> > > his own, choosing proper encryption parameters: algorithms, key lengths,
>>> > > salts, etc. How are you going to expose this in the API or configuration?
>>> > >
>>> > > We should not implement data encryption at the binary level; this is out of
>>> > > the question. Encryption should be implemented at the application level (user
>>> > > effort), the transport layer (SSL - we already have it), and possibly at the
>>> > > disk level (there are tools for this already).
>>> > >
>>> > >
>>> > > On Fri, Jun 9, 2017 at 9:06 AM, Vyacheslav Daradur <[hidden email]>
>>> > > wrote:
>>> > >
>>> > > > >> which is much less useful.
>>> > > > I note that in some cases an object shrinks by more than a factor of two.
>>> > > >
>>> > > > >> Would it be possible to change your implementation to handle the
>>> > > > >> encryption instead?
>>> > > > Yes, of course, there's not much difference between compression and
>>> > > > encryption, including in my implementation of per-field compression.
>>> > > >
>>> > > > 2017-06-09 8:55 GMT+03:00 Dmitriy Setrakyan <[hidden email]>:
>>> > > >
>>> > > > > Vyacheslav,
>>> > > > >
>>> > > > > When this feature started out as data compression in Ignite, it sounded
>>> > > > > very useful. Now it is unfolding as per-field compression, which is much
>>> > > > > less useful. In fact, it is questionable whether it is useful at all. The
>>> > > > > fact that this feature is implemented does not make it mandatory for the
>>> > > > > community to accept it.
>>> > > > >
>>> > > > > However, as I mentioned before, per-field encryption is very useful, as it
>>> > > > > would allow users to automatically encrypt certain sensitive fields, like
>>> > > > > passwords, credit card numbers, etc. There is not much conceptual
>>> > > > > difference between compressing a field vs encrypting a field. Would it be
>>> > > > > possible to change your implementation to handle the encryption instead?
>>> > > > >
>>> > > > > D.
>>> > > > >
>>> > > > > On Thu, Jun 8, 2017 at 10:42 PM, Vyacheslav Daradur <[hidden email]>
>>> > > > > wrote:
>>> > > > >
>>> > > > > > Guys, I want to be clear:
>>> > > > > > * The "per-field compression" design is the result of researching the
>>> > > > > > binary infrastructure of Ignite and some of its other parts (querying,
>>> > > > > > indexing, etc.)
>>> > > > > > * Full compression of an object would be more effective, but in this case
>>> > > > > > there is no compatibility with querying and indexing (or there is a large
>>> > > > > > overhead from decompressing the full object (or cache pages) on demand)
>>> > > > > > * "Per-field compression" is one of the ways to implement the compression
>>> > > > > > feature
>>> > > > > >
>>> > > > > > I'm new to Ignite, so I may be mistaken in some things.
>>> > > > > > For the last 3-4 months I've tried to start a discussion about the design,
>>> > > > > > but nobody answered (except Dmitry and Valentin, who were interested in how
>>> > > > > > it works).
>>> > > > > > But I understand that this is a community and nobody is obliged to anybody.
>>> > > > > >
>>> > > > > > There are strong Ignite experts here.
>>> > > > > > If they can help me and the community with a design for the compression
>>> > > > > > feature, it will be great.
>>> > > > > > At the moment I have the desire and time to work on developing the
>>> > > > > > compression feature in Ignite.
>>> > > > > > Let's use this opportunity :)
>>> > > > > >
>>> > > > > > 2017-06-09 5:36 GMT+03:00 Dmitriy Setrakyan <[hidden email]>:
>>> > > > > >
>>> > > > > > > Igniters,
>>> > > > > > >
>>> > > > > > > I have never seen a single Ignite user asking about compressing a single
>>> > > > > > > field. However, we have had requests to secure certain fields, e.g.
>>> > > > > > > passwords.
>>> > > > > > >
>>> > > > > > > I personally do not think per-field compression is needed, unless we can
>>> > > > > > > point out some concrete real life use cases.
>>> > > > > > >
>>> > > > > > > D.
>>> > > > > > >
>>> > > > > > > On Thu, Jun 8, 2017 at 3:42 AM, Vyacheslav Daradur <[hidden email]>
>>> > > > > > > wrote:
>>> > > > > > >
>>> > > > > > > > Anton,
>>> > > > > > > >
>>> > > > > > > > >> I thought that if compressed data is stored in memory, it will be
>>> > > > > > > > >> transmitted over the wire compressed too. Is that right?
>>> > > > > > > >
>>> > > > > > > > In the per-field compression case, yes.
>>> > > > > > > >
>>> > > > > > > > 2017-06-08 13:36 GMT+03:00 Антон Чураев <[hidden email]>:
>>> > > > > > > >
>>> > > > > > > > > Guys, could you please help me.
>>> > > > > > > > > I thought that if compressed data is stored in memory, it will be
>>> > > > > > > > > transmitted over the wire compressed too. Is that right?
>>> > > > > > > > >
>>> > > > > > > > > 2017-06-08 13:30 GMT+03:00 Vyacheslav Daradur <[hidden email]>:
>>> > > > > > > > >
>>> > > > > > > > > > Vladimir,
>>> > > > > > > > > >
>>> > > > > > > > > > The main problem I'm trying to solve is storing data in memory in
>>> > > > > > > > > > compressed form via Ignite.
>>> > > > > > > > > > The main goal is to use memory more effectively.
>>> > > > > > > > > >
>>> > > > > > > > > > >> here the much simpler step would be full compression on a
>>> > > > > > > > > > >> per-cache basis rather than dealing with the per-field case.
>>> > > > > > > > > >
>>> > > > > > > > > > Please explain your idea. Compress data by memory page?
>>> > > > > > > > > > Is it compatible with querying and indexing?
>>> > > > > > > > > >
>>> > > > > > > > > > >> In the end, if a user would like to compress a particular field,
>>> > > > > > > > > > >> he can always do it on his own
>>> > > > > > > > > > I don't think we should reason this way: if a user needs something, he
>>> > > > > > > > > > tries to choose a tool which has this feature OOTB.
>>> > > > > > > > > >
>>> > > > > > > > > >
>>> > > > > > > > > >
>>> > > > > > > > > > 2017-06-08 12:53 GMT+03:00 Vladimir Ozerov <[hidden email]>:
>>> > > > > > > > > >
>>> > > > > > > > > > > Igniters,
>>> > > > > > > > > > >
>>> > > > > > > > > > > Honestly, I still do not see how to apply this feature gracefully to
>>> > > > > > > > > > > Ignite. And the overall approach of compressing only particular fields
>>> > > > > > > > > > > looks overcomplicated to me. Remember that our main use case is an
>>> > > > > > > > > > > application without classes on the server. It means that any kind of
>>> > > > > > > > > > > annotations are inapplicable. To be more precise: a proper API should be
>>> > > > > > > > > > > implemented to handle the no-class case (e.g. how would one build such an
>>> > > > > > > > > > > object through BinaryBuilder without a class?), and only then should
>>> > > > > > > > > > > annotations be added as a convenient addition to the more basic API.
>>> > > > > > > > > > >
>>> > > > > > > > > > > It seems to me that a full implementation, which takes into account a
>>> > > > > > > > > > > proper "classless" API, changes to binary metadata to reflect compressed
>>> > > > > > > > > > > fields, changes to SQL, changes to the binary protocol, and porting to
>>> > > > > > > > > > > .NET and CPP, will yield a very complex solution with little value to
>>> > > > > > > > > > > the product.
>>> > > > > > > > > > >
>>> > > > > > > > > > > Instead, as I proposed earlier, it seems that we'd better start with
>>> > > > > > > > > > > the problem we are trying to solve. Basically, compression could help
>>> > > > > > > > > > > in two cases:
>>> > > > > > > > > > > 1) Transmitting data over the wire - it should be implemented on the
>>> > > > > > > > > > > communication layer and should not affect the binary serialization
>>> > > > > > > > > > > component a lot.
>>> > > > > > > > > > > 2) Storing data in memory - here the much simpler step would be full
>>> > > > > > > > > > > compression on a per-cache basis rather than dealing with the
>>> > > > > > > > > > > per-field case.
>>> > > > > > > > > > >
>>> > > > > > > > > > > In the end, if a user would like to compress a particular field, he can
>>> > > > > > > > > > > always do it on his own, and set the already compressed field to our
>>> > > > > > > > > > > BinaryObject.
>>> > > > > > > > > > >
>>> > > > > > > > > > > Vladimir.
>>> > > > > > > > > > >
>>> > > > > > > > > > >
>>> > > > > > > > > > > On Thu, Jun 8, 2017 at 12:37 PM, Vyacheslav Daradur <[hidden email]>
>>> > > > > > > > > > > wrote:
>>> > > > > > > > > > >
>>> > > > > > > > > > > > Valentin,
>>> > > > > > > > > > > >
>>> > > > > > > > > > > > Yes, I have the prototype [1][2].
>>> > > > > > > > > > > >
>>> > > > > > > > > > > > You can see an example of the Java class [3] that I used in my
>>> > > > > > > > > > > > benchmark.
>>> > > > > > > > > > > > For example:
>>> > > > > > > > > > > > class Foo {
>>> > > > > > > > > > > >     @BinaryCompression
>>> > > > > > > > > > > >     String data;
>>> > > > > > > > > > > > }
>>> > > > > > > > > > > > If a user decides to store the object in compressed form, he can use
>>> > > > > > > > > > > > the annotation @BinaryCompression as shown above.
>>> > > > > > > > > > > > It means the annotated field 'data' will be compressed at marshalling.
>>> > > > > > > > > > > >
>>> > > > > > > > > > > > [1] https://github.com/apache/ignite/pull/1951
>>> > > > > > > > > > > > [2] https://issues.apache.org/jira/browse/IGNITE-5226
>>> > > > > > > > > > > > [3] https://github.com/daradurvs/ignite-compression/blob/master/src/main/java/ru/daradurvs/ignite/compression/model/Audit1F.java
>>> > > > > > > > > > > >
>>> > > > > > > > > > > >
>>> > > > > > > > > > > >
>>> > > > > > > > > > > > 2017-06-08 2:04 GMT+03:00 Valentin Kulichenko <[hidden email]>:
>>> > > > > > > > > > > >
>>> > > > > > > > > > > > > Vyacheslav, Anton,
>>> > > > > > > > > > > > >
>>> > > > > > > > > > > > > Are there any ideas and/or prototypes for the API? Your design
>>> > > > > > > > > > > > > suggestions seem to make sense, but I would like to see how all this
>>> > > > > > > > > > > > > will look from the user's standpoint.
>>> > > > > > > > > > > > >
>>> > > > > > > > > > > > > -Val
>>> > > > > > > > > > > > >
>>> > > > > > > > > > > > > On Wed, Jun 7, 2017 at 1:06 AM, Антон Чураев <[hidden email]>
>>> > > > > > > > > > > > > wrote:
>>> > > > > > > > > > > > >
>>> > > > > > > > > > > > > > Vyacheslav, correct me if something is wrong.
>>> > > > > > > > > > > > > >
>>> > > > > > > > > > > > > > We could give users the opportunity to choose between CPU usage and
>>> > > > > > > > > > > > > > MEM/NET usage by compressing some attributes of stored objects.
>>> > > > > > > > > > > > > > You have studied the design, and it is possible to localize the
>>> > > > > > > > > > > > > > changes in marshalling without affecting performance or current
>>> > > > > > > > > > > > > > functionality.
>>> > > > > > > > > > > > > >
>>> > > > > > > > > > > > > > I think that it's useful for our project and users.
>>> > > > > > > > > > > > > > Community, what do you think about this proposal?
>>> > > > > > > > > > > > > >
>>> > > > > > > > > > > > > >
>>> > > > > > > > > > > > > > 2017-06-06 17:29 GMT+03:00 Vyacheslav Daradur <[hidden email]>:
>>> > > > > > > > > > > > > >
>>> > > > > > > > > > > > > > > In short,
>>> > > > > > > > > > > > > > >
>>> > > > > > > > > > > > > > > During marshalling, a field is represented as a BinaryFieldAccessor,
>>> > > > > > > > > > > > > > > which manages its marshalling. It checks whether the field is marked
>>> > > > > > > > > > > > > > > with the annotation @BinaryCompression; in that case the binary
>>> > > > > > > > > > > > > > > representation of the field (byte array) will be compressed. It will
>>> > > > > > > > > > > > > > > be marked as compressed by a type constant
>>> > > > > > > > > > > > > > > (GridBinaryMarshaller.COMPRESSED); after this the compressed byte
>>> > > > > > > > > > > > > > > array will be included in the binary representation of the whole
>>> > > > > > > > > > > > > > > object. Note that the header of the marshalled object will not be
>>> > > > > > > > > > > > > > > compressed. Compression affects only the object's field
>>> > > > > > > > > > > > > > > representation.
>>> > > > > > > > > > > > > > >
>>> > > > > > > > > > > > > > > Objects in IgniteCache are represented as BinaryObject, which is a
>>> > > > > > > > > > > > > > > wrapper over the byte array of the marshalled object.
>>> > > > > > > > > > > > > > > BinaryObject provides some useful methods, which are used by Ignite
>>> > > > > > > > > > > > > > > systems.
>>> > > > > > > > > > > > > > > For example, queries use the BinaryObject#field method, which
>>> > > > > > > > > > > > > > > deserializes only one field of the object, without deserializing the
>>> > > > > > > > > > > > > > > whole object.
>>> > > > > > > > > > > > > > > During deserialization, if the BinaryObject#field method meets the
>>> > > > > > > > > > > > > > > constant of the compressed type, it decompresses this byte array, then
>>> > > > > > > > > > > > > > > continues unmarshalling as usual.
>>> > > > > > > > > > > > > > >
>>> > > > > > > > > > > > > > > Now, I have introduced the Compressor interface in
>>> > > > > > > > > > > > > > > IgniteConfigurations; it allows the user to use his own implementation
>>> > > > > > > > > > > > > > > of the compressor - it is the requirement in the task [1].
>>> > > > > > > > > > > > > > >
>>> > > > > > > > > > > > > > > As far as I know, Vladimir Ozerov doesn't like the idea of granting
>>> > > > > > > > > > > > > > > this opportunity to the user.
>>> > > > > > > > > > > > > > > In that case we can choose a compression algorithm which we will
>>> > > > > > > > > > > > > > > provide by default and move the interface to the internals of the
>>> > > > > > > > > > > > > > > binary infrastructure.
>>> > > > > > > > > > > > > > > For this case I've prepared a benchmark, which I sent earlier.
>>> > > > > > > > > > > > > > >
>>> > > > > > > > > > > > > > > I vote for the ZSTD algorithm [2]; it provides a good compression
>>> > > > > > > > > > > > > > > ratio and good throughput. It has implementations in Java, .NET and
>>> > > > > > > > > > > > > > > C++, and has an ASF-friendly license, so we can use it on all Ignite
>>> > > > > > > > > > > > > > > platforms.
>>> > > > > > > > > > > > > > > You can look at an assessment of this algorithm in my benchmarks.
>>> > > > > > > > > > > > > > >
>>> > > > > > > > > > > > > > > [1] https://issues.apache.org/jira/browse/IGNITE-3592
>>> > > > > > > > > > > > > > > [2] https://github.com/facebook/zstd
>>> > > > > > > > > > > > > > >
>>> > > > > > > > > > > > > > >
>>> > > > > > > > > > > > > > > 2017-06-06 16:02 GMT+03:00 Антон Чураев <
>>> > > > > > > > [hidden email]
>>> > > > > > > > > >:
>>> > > > > > > > > > > > > > >
>>> > > > > > > > > > > > > > > > Looks good for me.
>>> > > > > > > > > > > > > > > >
>>> > > > > > > > > > > > > > > > Could You propose design of implementation
>>> in
>>> > > > couple
>>> > > > > of
>>> > > > > > > > > > > sentences?
>>> > > > > > > > > > > > > > > > So that we can estimate the completeness
>>> and
>>> > > > > complexity
>>> > > > > > > of
>>> > > > > > > > > the
>>> > > > > > > > > > > > > > proposal.
>>> > > > > > > > > > > > > > > >
>>> > > > > > > > > > > > > > > > 2017-06-06 15:26 GMT+03:00 Vyacheslav
>>> Daradur <
>>> > > > > > > > > > > [hidden email]
>>> > > > > > > > > > > > >:
>>> > > > > > > > > > > > > > > >
>>> > > > > > > > > > > > > > > > > Anton,
>>> > > > > > > > > > > > > > > > >
>>> > > > > > > > > > > > > > > > > Of course, the solution does not affect
>>> on
>>> > > > existing
>>> > > > > > > > > > > > > implementation. I
>>> > > > > > > > > > > > > > > > mean,
>>> > > > > > > > > > > > > > > > > there is no changes if user not use the
>>> > > > annotation
>>> > > > > > > > > > > > > > @BinaryCompression.
>>> > > > > > > > > > > > > > > > (no
>>> > > > > > > > > > > > > > > > > performance changes)
>>> > > > > > > > > > > > > > > > > Only if user make decision to use
>>> compression
>>> > > on
>>> > > > > > > specific
>>> > > > > > > > > > field
>>> > > > > > > > > > > > or
>>> > > > > > > > > > > > > > > fields
>>> > > > > > > > > > > > > > > > > of a class - in that case compression
>>> will be
>>> > > > used
>>> > > > > at
>>> > > > > > > > > > > marshalling
>>> > > > > > > > > > > > > in
>>> > > > > > > > > > > > > > > > > relation to annotated fields.
>>> > > > > > > > > > > > > > > > >
>>> > > > > > > > > > > > > > > > > 2017-06-06 15:10 GMT+03:00 Антон Чураев <
>>> > > > > > > > > > [hidden email]
>>> > > > > > > > > > > >:
>>> > > > > > > > > > > > > > > > >
>>> > > > > > > > > > > > > > > > > > Vyacheslav,
>>> > > > > > > > > > > > > > > > > >
>>> > > > > > > > > > > > > > > > > > Is it possible to propose
>>> implementation
>>> > that
>>> > > > can
>>> > > > > > be
>>> > > > > > > > > > switched
>>> > > > > > > > > > > > on
>>> > > > > > > > > > > > > > > > > on-demand?
>>> > > > > > > > > > > > > > > > > > In this case it should not affect
>>> > performance
>>> > > > of
>>> > > > > > > > current
>>> > > > > > > > > > > > > solution.
>>> > > > > > > > > > > > > > > > > >
>>> > > > > > > > > > > > > > > > > > I mean, that users should make decision
>>> > what
>>> > > is
>>> > > > > > more
>>> > > > > > > > > > > important
>>> > > > > > > > > > > > > for
>>> > > > > > > > > > > > > > > > them:
>>> > > > > > > > > > > > > > > > > > throughput or memory/net usage.
>>> > > > > > > > > > > > > > > > > > May be they will be choose not all
>>> objects,
>>> > > or
>>> > > > > only
>>> > > > > > > > some
>>> > > > > > > > > > > > > attributes
>>> > > > > > > > > > > > > > > of
>>> > > > > > > > > > > > > > > > > > objects for compress.
>>> > > > > > > > > > > > > > > > > >
>>> > > > > > > > > > > > > > > > > > 2017-06-06 14:48 GMT+03:00 Vyacheslav
>>> > > Daradur <
>>> > > > > > > > > > > > > [hidden email]
>>> > > > > > > > > > > > > > >:
>>> > > > > > > > > > > > > > > > > >
>>> > > > > > > > > > > > > > > > > > > Conclusion:
>>> > > > > > > > > > > > > > > > > > > Provided solution allows reduce size
>>> of
>>> > an
>>> > > > > object
>>> > > > > > > in
>>> > > > > > > > > > > > > IgniteCache
>>> > > > > > > > > > > > > > at
>>> > > > > > > > > > > > > > > > the
>>> > > > > > > > > > > > > > > > > > > cost of throughput reduction (small
>>> - in
>>> > > some
>>> > > > > > > cases),
>>> > > > > > > > > it
>>> > > > > > > > > > > > > depends
>>> > > > > > > > > > > > > > on
>>> > > > > > > > > > > > > > > > > part
>>> > > > > > > > > > > > > > > > > > of
>>> > > > > > > > > > > > > > > > > > > object which will be compressed and
>>> > > > compression
>>> > > > > > > > > > algorithm.
>>> > > > > > > > > > > > > > > > > > > I mean, we can make more effective
>>> use of
>>> > > > > memory,
>>> > > > > > > and
>>> > > > > > > > > in
>>> > > > > > > > > > > some
>>> > > > > > > > > > > > > > cases
>>> > > > > > > > > > > > > > > > it
>>> > > > > > > > > > > > > > > > > > can
>>> > > > > > > > > > > > > > > > > > > reduce loading of the interconnect.
>>> > > > > (replication,
>>> > > > > > > > > > > > rebalancing)
>>> > > > > > > > > > > > > > > > > > >
>>> > > > > > > > > > > > > > > > > > > Especially, it will be particularly
>>> > useful
>>> > > > for
>>> > > > > > > > object's
>>> > > > > > > > > > > > fields
>>> > > > > > > > > > > > > > > which
>>> > > > > > > > > > > > > > > > > are
>>> > > > > > > > > > > > > > > > > > > large text (>~ 250 bytes) and can be
>>> > > > > effectively
>>> > > > > > > > > > > compressed.
>>> > > > > > > > > > > > > > > > > > >
>>> > > > > > > > > > > > > > > > > > > 2017-06-06 12:00 GMT+03:00 Антон
>>> Чураев <
>>> > > > > > > > > > > > [hidden email]
>>> > > > > > > > > > > > > >:
>>> > > > > > > > > > > > > > > > > > >
>>> > > > > > > > > > > > > > > > > > > > Vyacheslav, thank you! But could
>>> you
>>> > > please
>>> > > > > > > > provide a
>>> > > > > > > > > > > > > > conclusions
>>> > > > > > > > > > > > > > > > or
>>> > > > > > > > > > > > > > > > > > > > proposals based on this benchmarks?
>>> > > > > > > > > > > > > > > > > > > >
>>> > > > > > > > > > > > > > > > > > > > 2017-06-06 11:28 GMT+03:00
>>> Vyacheslav
>>> > > > > Daradur <
>>> > > > > > > > > > > > > > > [hidden email]
>>> > > > > > > > > > > > > > > > >:
>>> > > > > > > > > > > > > > > > > > > >
>>> > > > > > > > > > > > > > > > > > > > > Dmitry,
>>> > > > > > > > > > > > > > > > > > > > >
>>> > > > > > > > > > > > > > > > > > > > > Excel-pages:
>>> > > > > > > > > > > > > > > > > > > > >
>>> > > > > > > > > > > > > > > > > > > > > 1). "Compression ratio (2)" -
>>> shows
>>> > > > object
>>> > > > > > > size,
>>> > > > > > > > > with
>>> > > > > > > > > > > > > > > compression
>>> > > > > > > > > > > > > > > > > and
>>> > > > > > > > > > > > > > > > > > > > > without compression. (Conditions:
>>> > > literal
>>> > > > > > text)
>>> > > > > > > > > > > > > > > > > > > > > 1st graph shows compression
>>> ratios of
>>> > > > using
>>> > > > > > > > > different
>>> > > > > > > > > > > > > > > compression
>>> > > > > > > > > > > > > > > > > > > > algorithms
>>> > > > > > > > > > > > > > > > > > > > > depending on size of compressed
>>> > field.
>>> > > > > > > > > > > > > > > > > > > > > 2nd graph shows evaluation of
>>> size of
>>> > > > > objects
>>> > > > > > > > > > depending
>>> > > > > > > > > > > > on
>>> > > > > > > > > > > > > > > sizes
>>> > > > > > > > > > > > > > > > > and
>>> > > > > > > > > > > > > > > > > > > > > compression algorithms.
>>> > > > > > > > > > > > > > > > > > > > >
>>> > > > > > > > > > > > > > > > > > > > > 2). "Compression ratio (1)" -
>>> shows
>>> > > > object
>>> > > > > > > size,
>>> > > > > > > > > with
>>> > > > > > > > > > > > > > > compression
>>> > > > > > > > > > > > > > > > > and
>>> > > > > > > > > > > > > > > > > > > > > without compression. (Conditions:
>>> > > badly
>>> > > > > > > > compressed
>>> > > > > > > > > > > > > character
>>> > > > > > > > > > > > > > > > > > sequence)
>>> > > > > > > > > > > > > > > > > > > > > 1st graph shows compression
>>> ratios of
>>> > > > using
>>> > > > > > > > > different
>>> > > > > > > > > > > > > > > compression
>>> > > > > > > > > > > > > > > > > > > > > algorithms depending on size of
>>> > > compressed
>>> > > > > > > field.
>>> > > > > > > > > > > > > > > > > > > > > 2nd graph shows evaluation of
>>> size of
>>> > > > > objects
>>> > > > > > > > > > depending
>>> > > > > > > > > > > > on
>>> > > > > > > > > > > > > > > sizes
>>> > > > > > > > > > > > > > > > > and
>>> > > > > > > > > > > > > > > > > > > > > compression algorithms.
>>> > > > > > > > > > > > > > > > > > > > >
>>> > > > > > > > > > > > > > > > > > > > > 3) "put-avg" - shows average
>>> time of
>>> > > the
>>> > > > > > "put"
>>> > > > > > > > > > > operation
>>> > > > > > > > > > > > > > > > depending
>>> > > > > > > > > > > > > > > > > on
>>> > > > > > > > > > > > > > > > > > > > size
>>> > > > > > > > > > > > > > > > > > > > > and compression algorithms.
>>> > > > > > > > > > > > > > > > > > > > >
>>> > > > > > > > > > > > > > > > > > > > > 4) "put-thrpt" - shows
>>> throughput of
>>> > > the
>>> > > > > > "put"
>>> > > > > > > > > > > operation
>>> > > > > > > > > > > > > > > > depending
>>> > > > > > > > > > > > > > > > > on
>>> > > > > > > > > > > > > > > > > > > > size
>>> > > > > > > > > > > > > > > > > > > > > and compression algorithms.
>>> > > > > > > > > > > > > > > > > > > > >
>>> > > > > > > > > > > > > > > > > > > > > 5) "get-avg" - shows average
>>> time of
>>> > > the
>>> > > > > > "get"
>>> > > > > > > > > > > operation
>>> > > > > > > > > > > > > > > > depending
>>> > > > > > > > > > > > > > > > > on
>>> > > > > > > > > > > > > > > > > > > > size
>>> > > > > > > > > > > > > > > > > > > > > and compression algorithms.
>>> > > > > > > > > > > > > > > > > > > > >
>>> > > > > > > > > > > > > > > > > > > > > 6) "get-thrpt" - shows
>>> throughput of
>>> > > the
>>> > > > > > "get"
>>> > > > > > > > > > > operation
>>> > > > > > > > > > > > > > > > depending
>>> > > > > > > > > > > > > > > > > on
>>> > > > > > > > > > > > > > > > > > > > size
>>> > > > > > > > > > > > > > > > > > > > > and compression algorithms.
>>> > > > > > > > > > > > > > > > > > > > >
>>> > > > > > > > > > > > > > > > > > > > >
>>> > > > > > > > > > > > > > > > > > > > >
>>> > > > > > > > > > > > > > > > > > > > >
>>> > > > > > > > > > > > > > > > > > > > > 2017-06-06 10:59 GMT+03:00
>>> Dmitriy
>>> > > > > Setrakyan
>>> > > > > > <
>>> > > > > > > > > > > > > > > > > [hidden email]
>>> > > > > > > > > > > > > > > > > > >:
>>> > > > > > > > > > > > > > > > > > > > >
>>> > > > > > > > > > > > > > > > > > > > > > Vladimir, I am not sure how to
>>> > > > interpret
>>> > > > > > the
>>> > > > > > > > > > graphs?
>>> > > > > > > > > > > > What
>>> > > > > > > > > > > > > > are
>>> > > > > > > > > > > > > > > > we
>>> > > > > > > > > > > > > > > > > > > > looking
>>> > > > > > > > > > > > > > > > > > > > > > at?
>>> > > > > > > > > > > > > > > > > > > > > >
>>> > > > > > > > > > > > > > > > > > > > > > On Tue, Jun 6, 2017 at 12:33
>>> AM,
>>> > > > > Vyacheslav
>>> > > > > > > > > > Daradur <
>>> > > > > > > > > > > > > > > > > > > > [hidden email]
>>> > > > > > > > > > > > > > > > > > > > > >
>>> > > > > > > > > > > > > > > > > > > > > > wrote:
>>> > > > > > > > > > > > > > > > > > > > > >
>>> > > > > > > > > > > > > > > > > > > > > > > Hi, Igniters.
>>> > > > > > > > > > > > > > > > > > > > > > >
>>> > > > > > > > > > > > > > > > > > > > > > > I've prepared some
>>> benchmarking.
>>> > > > > Results
>>> > > > > > > [1].
>>> > > > > > > > > > > > > > > > > > > > > > >
>>> > > > > > > > > > > > > > > > > > > > > > > And I've prepared the
>>> evaluation
>>> > in
>>> > > > the
>>> > > > > > > form
>>> > > > > > > > of
>>> > > > > > > > > > > > > diagrams
>>> > > > > > > > > > > > > > > [2].
>>> > > > > > > > > > > > > > > > > > > > > > >
>>> > > > > > > > > > > > > > > > > > > > > > > I hope that helps to
>>> interest the
>>> > > > > > community
>>> > > > > > > > and
>>> > > > > > > > > > > > > > > accelerates a
>>> > > > > > > > > > > > > > > > > > > > reaction
>>> > > > > > > > > > > > > > > > > > > > > to
>>> > > > > > > > > > > > > > > > > > > > > > > this improvement :)
>>> > > > > > > > > > > > > > > > > > > > > > >
>>> > > > > > > > > > > > > > > > > > > > > > > [1]
>>> > > > > > > > > > > > > > > > > > > > > > >
>>> https://github.com/daradurvs/
>>> > > > > > > > > > > > ignite-compression/tree/
>>> > > > > > > > > > > > > > > > > > > > > > >
>>> master/src/main/resources/result
>>> > > > > > > > > > > > > > > > > > > > > > > [2]
>>> > https://drive.google.com/file/
>>> > > d/
>>> > > > > > > > > > > > > > > > > > 0B2CeUAOgrHkoMklyZ25YTEdKcEk/
>>> > > > > > > > > > > > > > > > > > > > view
>>> > > > > > > > > > > > > > > > > > > > > > >
>>> > > > > > > > > > > > > > > > > > > > > > >
>>> > > > > > > > > > > > > > > > > > > > > > >
>>> > > > > > > > > > > > > > > > > > > > > > > 2017-05-24 9:49 GMT+03:00
>>> > > Vyacheslav
>>> > > > > > > Daradur
>>> > > > > > > > <
>>> > > > > > > > > > > > > > > > > > [hidden email]
>>> > > > > > > > > > > > > > > > > > > >:
>>> > > > > > > > > > > > > > > > > > > > > > >
>>> > > > > > > > > > > > > > > > > > > > > > > > Guys, any thoughts?
>>> > > > > > > > > > > > > > > > > > > > > > > >
>>> > > > > > > > > > > > > > > > > > > > > > > > 2017-05-16 13:40 GMT+03:00
>>> > > > Vyacheslav
>>> > > > > > > > > Daradur <
>>> > > > > > > > > > > > > > > > > > > [hidden email]
>>> > > > > > > > > > > > > > > > > > > > >:
>>> > > > > > > > > > > > > > > > > > > > > > > >
>>> > > > > > > > > > > > > > > > > > > > > > > >> Hi guys,
>>> > > > > > > > > > > > > > > > > > > > > > > >>
>>> > > > > > > > > > > > > > > > > > > > > > > >> I've prepared the PR to
>>> show
>>> > my
>>> > > > > idea.
>>> > > > > > > > > > > > > > > > > > > > > > > >>
>>> https://github.com/apache/
>>> > > > > > > > > > > ignite/pull/1951/files
>>> > > > > > > > > > > > > > > > > > > > > > > >>
>>> > > > > > > > > > > > > > > > > > > > > > > >> About querying - I've just
>>> > > copied
>>> > > > > > > existing
>>> > > > > > > > > > tests
>>> > > > > > > > > > > > and
>>> > > > > > > > > > > > > > > have
>>> > > > > > > > > > > > > > > > > > > > annotated
>>> > > > > > > > > > > > > > > > > > > > > > the
>>> > > > > > > > > > > > > > > > > > > > > > > >> testing data.
>>> > > > > > > > > > > > > > > > > > > > > > > >>
>>> https://github.com/apache/
>>> > > > > > > > > > > > > > ignite/pull/1951/files#diff-
>>> > > > > > > > > > > > > > > > > c19a9d
>>> > > > > > > > > > > > > > > > > > > > > > > >> f4058141d059bb577e75244764
>>> > > > > > > > > > > > > > > > > > > > > > > >>
>>> > > > > > > > > > > > > > > > > > > > > > > >> It means fields which
>>> will be
>>> > > > marked
>>> > > > > > by
>>> > > > > > > > > > > > > > > @BinaryCompression
>>> > > > > > > > > > > > > > > > > > will
>>> > > > > > > > > > > > > > > > > > > be
>>> > > > > > > > > > > > > > > > > > > > > > > >> compressed at marshalling
>>> via
>>> > > > > > > > > > BinaryMarshaller.
>>> > > > > > > > > > > > > > > > > > > > > > > >>
>>> > > > > > > > > > > > > > > > > > > > > > > >> This solution has no
>>> effect on
>>> > > > > > existing
>>> > > > > > > > data
>>> > > > > > > > > > or
>>> > > > > > > > > > > > > > project
>>> > > > > > > > > > > > > > > > > > > > > architecture.
>>> > > > > > > > > > > > > > > > > > > > > > > >>
>>> > > > > > > > > > > > > > > > > > > > > > > >> I'll be glad to see your
>>> > > thoughts.
>>> > > > > > > > > > > > > > > > > > > > > > > >>
>>> > > > > > > > > > > > > > > > > > > > > > > >>
>>> > > > > > > > > > > > > > > > > > > > > > > >> 2017-05-15 19:18 GMT+03:00
>>> > > > > Vyacheslav
>>> > > > > > > > > Daradur
>>> > > > > > > > > > <
>>> > > > > > > > > > > > > > > > > > > > [hidden email]
>>> > > > > > > > > > > > > > > > > > > > > >:
>>> > > > > > > > > > > > > > > > > > > > > > > >>
>>> > > > > > > > > > > > > > > > > > > > > > > >>> Dmitriy,
>>> > > > > > > > > > > > > > > > > > > > > > > >>>
>>> > > > > > > > > > > > > > > > > > > > > > > >>> I have ready prototype. I
>>> > want
>>> > > to
>>> > > > > > show
>>> > > > > > > > it.
>>> > > > > > > > > > > > > > > > > > > > > > > >>> It is always easier to
>>> > discuss
>>> > > on
>>> > > > > > > > example.
>>> > > > > > > > > > > > > > > > > > > > > > > >>>
>>> > > > > > > > > > > > > > > > > > > > > > > >>> 2017-05-15 19:02
>>> GMT+03:00
>>> > > > Dmitriy
>>> > > > > > > > > Setrakyan
>>> > > > > > > > > > <
>>> > > > > > > > > > > > > > > > > > > > > [hidden email]
>>> > > > > > > > > > > > > > > > > > > > > > >:
>>> > > > > > > > > > > > > > > > > > > > > > > >>>
>>> > > > > > > > > > > > > > > > > > > > > > > >>>> Vyacheslav,
>>> > > > > > > > > > > > > > > > > > > > > > > >>>>
>>> > > > > > > > > > > > > > > > > > > > > > > >>>> I think it is a bit
>>> > premature
>>> > > to
>>> > > > > > > > provide a
>>> > > > > > > > > > PR
>>> > > > > > > > > > > > > > without
>>> > > > > > > > > > > > > > > > > > getting
>>> > > > > > > > > > > > > > > > > > > a
>>> > > > > > > > > > > > > > > > > > > > > > > >>>> community
>>> > > > > > > > > > > > > > > > > > > > > > > >>>> consensus on the dev
>>> list.
>>> > > > Please
>>> > > > > > > allow
>>> > > > > > > > > some
>>> > > > > > > > > > > > time
>>> > > > > > > > > > > > > > for
>>> > > > > > > > > > > > > > > > the
>>> > > > > > > > > > > > > > > > > > > > > community
>>> > > > > > > > > > > > > > > > > > > > > > to
>>> > > > > > > > > > > > > > > > > > > > > > > >>>> respond.
>>> > > > > > > > > > > > > > > > > > > > > > > >>>>
>>> > > > > > > > > > > > > > > > > > > > > > > >>>> D.
>>> > > > > > > > > > > > > > > > > > > > > > > >>>>
>>> > > > > > > > > > > > > > > > > > > > > > > >>>> On Mon, May 15, 2017 at
>>> 6:36
>>> > > AM,
>>> > > > > > > > > Vyacheslav
>>> > > > > > > > > > > > > Daradur
>>> > > > > > > > > > > > > > <
>>> > > > > > > > > > > > > > > > > > > > > > > >>>> [hidden email]>
>>> > > > > > > > > > > > > > > > > > > > > > > >>>> wrote:
>>> > > > > > > > > > > > > > > > > > > > > > > >>>>
>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > I created the ticket:
>>> > > > > > > > > > > > > > > https://issues.apache.org/jira
>>> > > > > > > > > > > > > > > > > > > > > > > >>>> /browse/IGNITE-5226
>>> > > > > > > > > > > > > > > > > > > > > > > >>>> >
>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > I'll prepare a PR with
>>> > > > described
>>> > > > > > > > > solution
>>> > > > > > > > > > in
>>> > > > > > > > > > > > > > couple
>>> > > > > > > > > > > > > > > of
>>> > > > > > > > > > > > > > > > > > days.
>>> > > > > > > > > > > > > > > > > > > > > > > >>>> >
>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > 2017-05-15 15:05
>>> GMT+03:00
>>> > > > > > > Vyacheslav
>>> > > > > > > > > > > Daradur
>>> > > > > > > > > > > > <
>>> > > > > > > > > > > > > > > > > > > > > > [hidden email]
>>> > > > > > > > > > > > > > > > > > > > > > > >:
>>> > > > > > > > > > > > > > > > > > > > > > > >>>> >
>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > > Hi, Igniters!
>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >
>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > > Apache Ignite 2.0 is
>>> released.
>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >
>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > > Let's continue the
>>> > > > discussion
>>> > > > > > > about
>>> > > > > > > > a
>>> > > > > > > > > > > > > > compression
>>> > > > > > > > > > > > > > > > > > design.
>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >
>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > > At the moment, I
>>> found
>>> > > only
>>> > > > > one
>>> > > > > > > > > solution
>>> > > > > > > > > > > > which
>>> > > > > > > > > > > > > > is
>>> > > > > > > > > > > > > > > > > > > compatible
>>> > > > > > > > > > > > > > > > > > > > > > with
>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > querying
>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > > and indexing, this
>>> is
>>> > > > > > > > > per-objects-field
>>> > > > > > > > > > > > > > > compression.
>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > > Per-fields
>>> compression
>>> > > means
>>> > > > > > that
>>> > > > > > > > > > metadata
>>> > > > > > > > > > > > (a
>>> > > > > > > > > > > > > > > > header)
>>> > > > > > > > > > > > > > > > > of
>>> > > > > > > > > > > > > > > > > > > an
>>> > > > > > > > > > > > > > > > > > > > > > object
>>> > > > > > > > > > > > > > > > > > > > > > > >>>> won't
>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > > be compressed, only
>>> > > > serialized
>>> > > > > > > > values
>>> > > > > > > > > of
>>> > > > > > > > > > > an
>>> > > > > > > > > > > > > > object
>>> > > > > > > > > > > > > > > > > > fields
>>> > > > > > > > > > > > > > > > > > > > (in
>>> > > > > > > > > > > > > > > > > > > > > > > bytes
>>> > > > > > > > > > > > > > > > > > > > > > > >>>> array
>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > > form) will be
>>> > compressed.
>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >
>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > > This solution have
>>> some
>>> > > > > > > contentious
>>> > > > > > > > > > > issues:
>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > > - small values, like
>>> > > > > primitives
>>> > > > > > > and
>>> > > > > > > > > > short
>>> > > > > > > > > > > > > > arrays -
>>> > > > > > > > > > > > > > > > > there
>>> > > > > > > > > > > > > > > > > > > > isn't
>>> > > > > > > > > > > > > > > > > > > > > > > >>>> sense to
>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > > compress them;
>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > > - there is no
>>> possible
>>> > to
>>> > > > use
>>> > > > > > > > > > compression
>>> > > > > > > > > > > > with
>>> > > > > > > > > > > > > > > > > > > > java-predefined
>>> > > > > > > > > > > > > > > > > > > > > > > >>>> types;
>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >
>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > > We can provide an
>>> > > > annotation,
>>> > > > > > > > > > > > > > @IgniteCompression -
>>> > > > > > > > > > > > > > > > for
>>> > > > > > > > > > > > > > > > > > > > > example,
>>> > > > > > > > > > > > > > > > > > > > > > > >>>> which can
>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > > be used by users for
>>> > > marking
>>> > > > > > > fields
>>> > > > > > > > to
>>> > > > > > > > > > > > > compress.
>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >
>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > > Any thoughts?
>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >
>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > > Maybe someone
>>> already
>>> > have
>>> > > > > ready
>>> > > > > > > > > design?
>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >
>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > > 2017-04-10 11:06
>>> > GMT+03:00
>>> > > > > > > > Vyacheslav
>>> > > > > > > > > > > > Daradur
>>> > > > > > > > > > > > > <
>>> > > > > > > > > > > > > > > > > > > > > > > [hidden email]
>>> > > > > > > > > > > > > > > > > > > > > > > >>>> >:
>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >
>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >> Alexey,
>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>
>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >> Yes, I've read it.
>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>
>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >> Ok, let's discuss
>>> about
>>> > > > > public
>>> > > > > > > API
>>> > > > > > > > > > > design.
>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>
>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >> I think we need to
>>> add
>>> > > > some a
>>> > > > > > > > > configure
>>> > > > > > > > > > > > > entity
>>> > > > > > > > > > > > > > to
>>> > > > > > > > > > > > > > > > > > > > > > > >>>> CacheConfiguration,
>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >> which will contain
>>> the
>>> > > > > > Compressor
>>> > > > > > > > > > > interface
>>> > > > > > > > > > > > > > > > > > > implementation
>>> > > > > > > > > > > > > > > > > > > > > and
>>> > > > > > > > > > > > > > > > > > > > > > > some
>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > useful
>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >> parameters.
>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >> Or maybe to
>>> provide a
>>> > > > > > > > > BinaryMarshaller
>>> > > > > > > > > > > > > > decorator,
>>> > > > > > > > > > > > > > > > > which
>>> > > > > > > > > > > > > > > > > > > > will
>>> > > > > > > > > > > > > > > > > > > > > be
>>> > > > > > > > > > > > > > > > > > > > > > > >>>> compress
>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >> data after
>>> marshalling.
>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>
>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>
>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >> 2017-04-10 10:40
>>> > > GMT+03:00
>>> > > > > > Alexey
>>> > > > > > > > > > > > Kuznetsov <
>>> > > > > > > > > > > > > > > > > > > > > > > [hidden email]
>>> > > > > > > > > > > > > > > > > > > > > > > >>>> >:
>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>
>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> Vyacheslav,
>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>>
>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> Did you read
>>> initial
>>> > > > > > discussion
>>> > > > > > > > [1]
>>> > > > > > > > > > > about
>>> > > > > > > > > > > > > > > > > compression?
>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> As far as I
>>> remember
>>> > we
>>> > > > > agreed
>>> > > > > > > to
>>> > > > > > > > > add
>>> > > > > > > > > > > only
>>> > > > > > > > > > > > > > some
>>> > > > > > > > > > > > > > > > > > > > "top-level"
>>> > > > > > > > > > > > > > > > > > > > > > API
>>> > > > > > > > > > > > > > > > > > > > > > > in
>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > order
>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> to
>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> provide a way for
>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> Ignite users to
>>> inject
>>> > > > some
>>> > > > > > sort
>>> > > > > > > > of
>>> > > > > > > > > > > custom
>>> > > > > > > > > > > > > > > > > > compression.
>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>>
>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>>
>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> [1]
>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>>
>>> > > > > > http://apache-ignite-developer
>>> > > > > > > > > > > > > > > s.2346864.n4.nabble
>>> > > > > > > > > > > > > > > > .
>>> > > > > > > > > > > > > > > > > > > > > com/Data-c
>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>>
>>> > > ompression-in-Ignite-2-0-
>>> > > > > > > > > td10099.html
>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>>
>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> On Mon, Apr 10,
>>> 2017
>>> > at
>>> > > > 2:19
>>> > > > > > PM,
>>> > > > > > > > > > > > daradurvs <
>>> > > > > > > > > > > > > > > > > > > > > > [hidden email]
>>> > > > > > > > > > > > > > > > > > > > > > > >
>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > wrote:
>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>>
>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > Hi Igniters!
>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> >
>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > I am interested
>>> in
>>> > > this
>>> > > > > > task.
>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > Provide some
>>> kind of
>>> > > > > > pluggable
>>> > > > > > > > > > > > compression
>>> > > > > > > > > > > > > > SPI
>>> > > > > > > > > > > > > > > > > > support
>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > <
>>> > > > > https://issues.apache.org/
>>> > > > > > > > > > > > > > > > > jira/browse/IGNITE-3592>
>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> >
>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > I developed a
>>> > solution
>>> > > > on
>>> > > > > > > > > > > > > > > > BinaryMarshaller-level,
>>> > > > > > > > > > > > > > > > > > but
>>> > > > > > > > > > > > > > > > > > > > > > reviewer
>>> > > > > > > > > > > > > > > > > > > > > > > >>>> has
>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> rejected
>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > it.
>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> >
>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > Let's continue
>>> > > > discussion
>>> > > > > of
>>> > > > > > > > task
>>> > > > > > > > > > > goals
>>> > > > > > > > > > > > > and
>>> > > > > > > > > > > > > > > > > solution
>>> > > > > > > > > > > > > > > > > > > > > design.
>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > As I understood
>>> > that,
>>> > > > the
>>> > > > > > main
>>> > > > > > > > > goal
>>> > > > > > > > > > of
>>> > > > > > > > > > > > > this
>>> > > > > > > > > > > > > > > task
>>> > > > > > > > > > > > > > > > > is
>>> > > > > > > > > > > > > > > > > > to
>>> > > > > > > > > > > > > > > > > > > > > store
>>> > > > > > > > > > > > > > > > > > > > > > > >>>> data in
>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > compressed form.
>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > This is what I
>>> need
>>> > > from
>>> > > > > > > Ignite
>>> > > > > > > > as
>>> > > > > > > > > > its
>>> > > > > > > > > > > > > user.
>>> > > > > > > > > > > > > > > > > > > Compression
>>> > > > > > > > > > > > > > > > > > > > > > > >>>> provides
>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> economy
>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > on
>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > servers.
>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > We can store
>>> more
>>> > data
>>> > > > on
>>> > > > > > same
>>> > > > > > > > > > servers
>>> > > > > > > > > > > > at
>>> > > > > > > > > > > > > > the
>>> > > > > > > > > > > > > > > > cost
>>> > > > > > > > > > > > > > > > > > of
>>> > > > > > > > > > > > > > > > > > > > > > > >>>> increasing CPU
>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > utilization.
>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> >
>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > I'm researching
>>> a
>>> > > > > > possibility
>>> > > > > > > of
>>> > > > > > > > > > > > > > > implementation
>>> > > > > > > > > > > > > > > > of
>>> > > > > > > > > > > > > > > > > > > > > > compression
>>> > > > > > > > > > > > > > > > > > > > > > > >>>> at the
>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > cache-level.
>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> >
>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > Any thoughts?
>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> >
>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > --
>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > Best regards,
>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > Vyacheslav
>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> >
>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> >
>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> >
>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> >
>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > --
>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > View this
>>> message in
>>> > > > > > context:
>>> > > > > > > > > > > > > > > > > http://apache-ignite-
>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> >
>>> > > > > > developers.2346864.n4.nabble.
>>> > > > > > > > > > > > > > > > > > com/Data-compression-in-
>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> >
>>> > > > > > Ignite-2-0-tp10099p16317.html
>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > Sent from the
>>> Apache
>>> > > > > Ignite
>>> > > > > > > > > > Developers
>>> > > > > > > > > > > > > > mailing
>>> > > > > > > > > > > > > > > > > list
>>> > > > > > > > > > > > > > > > > > > > > archive
>>> > > > > > > > > > > > > > > > > > > > > > at
>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> Nabble.com.
>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> >
>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>>
>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>>
>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>>
>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> --
>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> Alexey Kuznetsov
>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>>
>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>
>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>
>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>
>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >> --
>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >> Best Regards,
>>> > Vyacheslav
>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>
>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >
>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >
>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >
>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > > --
>>> > > > > > > > > > > > > > > > > > > > > > >
>>>
>> ...
>
> [Message clipped]

Re: Data compression in Ignite 2.0

dsetrakyan
I would prefer that we reuse an existing compression protocol, but at the table level.

If not possible, then we should go with a shared mapping approach. Any idea how hard?

⁣D.​

On Aug 1, 2017, at 11:15 AM, Vladimir Ozerov <[hidden email]> wrote:

>Vyacheslav,
>
>This is not about my needs, but about the product :-) BinaryObject is a
>central entity used for both data transfer and data storage. This is
>both good and bad at the same time.
>
>The good thing is that as we optimize the binary protocol, we improve
>both network and storage performance at the same time. We have at least
>3 things which will be included in the product soon: varint encoding
>[1], optimized string encoding [2] and null-field optimization [3]. The
>bad thing is that the binary object format is not well suited for data
>storage optimizations, including compression. For example, one good
>compression technique is to organize data in a column-store format, or
>to introduce a shared "dictionary" of unique values at the cache level.
>In both cases N equal values are not stored N times. Instead, we store
>one value and N references to it, or something similar. This way 2x-10x
>compression is possible, depending on workload type. The binary object
>protocol with some compression on top of it cannot give such an
>improvement, because it compresses data in individual objects instead
>of compressing the whole cache data in a single context.
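
The "one value and N references" idea above can be sketched in a few lines of Java. This is only an illustration of dictionary encoding in general, not Ignite code: the class name DictionaryColumnSketch and its methods are hypothetical, and a real cache-level dictionary would also have to deal with concurrency, eviction and persistence.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical illustration of a shared dictionary; not part of Ignite. */
final class DictionaryColumnSketch {
    /** Each distinct value is stored exactly once. */
    private final List<String> dictionary = new ArrayList<>();

    /** Reverse lookup: value -> its index in the dictionary. */
    private final Map<String, Integer> indexByValue = new HashMap<>();

    /** One small integer reference per row instead of the value itself. */
    private final List<Integer> references = new ArrayList<>();

    /** Appends a row, reusing the dictionary slot if the value was seen before. */
    void add(String value) {
        Integer idx = indexByValue.get(value);
        if (idx == null) {
            idx = dictionary.size();
            dictionary.add(value);
            indexByValue.put(value, idx);
        }
        references.add(idx);
    }

    /** Resolves a row back to its value through the dictionary. */
    String get(int row) {
        return dictionary.get(references.get(row));
    }

    int rowCount() {
        return references.size();
    }

    int distinctCount() {
        return dictionary.size();
    }

    public static void main(String[] args) {
        DictionaryColumnSketch city = new DictionaryColumnSketch();
        for (String v : new String[] {"Moscow", "London", "Moscow", "Moscow", "London"})
            city.add(v);
        System.out.println(city.rowCount() + " rows, " + city.distinctCount() + " distinct values");
    }
}
```

The saving comes from the fact that the dictionary is shared by all rows, which is exactly what compressing each object in isolation cannot exploit.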
>
>That said, I propose to give up adding compression to BinaryObject.
>This is a dead end. Instead, we should:
>1) Optimize the protocol itself to be more compact, as described in
>the aforementioned Ignite tickets
>2) Start a new discussion about storage compression
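
As a rough idea of what "more compact protocol" means for point 1, here is a minimal sketch of a common varint scheme (7 payload bits per byte, high bit as a continuation flag). Whether the Ignite ticket uses this exact layout is an assumption; the class name VarintSketch is hypothetical.

```java
import java.io.ByteArrayOutputStream;

/** Hypothetical illustration of varint encoding; not the Ignite implementation. */
final class VarintSketch {
    /** Encodes an int as 1 to 5 bytes, least significant 7-bit group first. */
    static byte[] encode(int value) {
        ByteArrayOutputStream out = new ByteArrayOutputStream(5);
        // While more than 7 significant bits remain, emit a group with the continuation bit set.
        while ((value & ~0x7F) != 0) {
            out.write((value & 0x7F) | 0x80);
            value >>>= 7;
        }
        out.write(value);
        return out.toByteArray();
    }

    /** Decodes a value produced by encode(). */
    static int decode(byte[] bytes) {
        int result = 0;
        int shift = 0;
        for (byte b : bytes) {
            result |= (b & 0x7F) << shift;
            if ((b & 0x80) == 0)
                break; // Last group reached.
            shift += 7;
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println("300 takes " + encode(300).length + " bytes instead of 4");
    }
}
```

Small values, which are typical for lengths and offsets, shrink from 4 bytes to 1 or 2, while very large values grow to 5 bytes.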
>
>You can read papers from other vendors to get a better understanding
>of possible compression options. E.g. Oracle has a lot of compression
>techniques, including heat maps, background compression, per-block
>compression, data dictionaries, etc. [4].
>
>[1] https://issues.apache.org/jira/browse/IGNITE-5097
>[2] https://issues.apache.org/jira/browse/IGNITE-5655
>[3] https://issues.apache.org/jira/browse/IGNITE-3939
>[4] http://www.oracle.com/technetwork/database/options/compression/advanced-compression-wp-12c-1896128.pdf
>
>Vladimir.
>
>
>On Tue, Jul 11, 2017 at 6:56 PM, Vyacheslav Daradur
><[hidden email]>
>wrote:
>
>> Hi Igniters!
>>
>> I'd like to continue developing and discussing compression in Ignite.
>>
>> Vladimir, could you propose a design for the compression feature in
>> Ignite that suits you?
>>
>> 2017-06-15 16:13 GMT+03:00 Vyacheslav Daradur <[hidden email]>:
>>
>>> Hi Igniters.
>>>
>>> Vladimir, I want to propose another design for the implementation of
>>> per-field compression.
>>>
>>> 1) We add a new step to the prepareForCache method (for example) of
>>> CacheObject, or to GridCacheMapEntry.
>>>
>>> At this step, after an object is marshalled, we compress those fields
>>> of the object that were described in advance.
>>> The user describes the class fields he wants to compress in a separate
>>> entity, such as Metadata.
>>>
>>> For compression we introduce another entity, for example
>>> CompressionProcessor, which works with the byte array (the marshalled
>>> object).
>>> The entity reads the byte arrays of the described fields, compresses
>>> them and rewrites the binary representation of the whole object.
>>> After processing, the object is put into the cache.
>>>
>>> In this case the design is not tied to the binary infrastructure.
>>> But there is a big heap-memory overhead for the buffer.
>>>
>>> 2) Another solution is to compress the byte array of the whole object
>>> when copying it off-heap.
>>> But in this case I don't yet understand how to support querying and
>>> indexing.
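
The byte-array step in option 1 could look roughly like the sketch below. It is only an assumption of how such a processor might compress and restore the bytes of one field: it uses the JDK's Deflater and Inflater as a stand-in codec, the class name FieldCompressionSketch is hypothetical, and CompressionProcessor itself exists only as a proposal in this thread.

```java
import java.io.ByteArrayOutputStream;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

/** Hypothetical illustration of compressing the marshalled bytes of a field. */
final class FieldCompressionSketch {
    /** Compresses raw field bytes; the output stream is the extra on-heap buffer. */
    static byte[] compress(byte[] raw) {
        Deflater deflater = new Deflater();
        try {
            deflater.setInput(raw);
            deflater.finish();
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[4096];
            while (!deflater.finished()) {
                int n = deflater.deflate(buf);
                out.write(buf, 0, n);
            }
            return out.toByteArray();
        } finally {
            deflater.end(); // Release native zlib resources.
        }
    }

    /** Restores the original field bytes from the output of compress(). */
    static byte[] decompress(byte[] packed) {
        Inflater inflater = new Inflater();
        try {
            inflater.setInput(packed);
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[4096];
            while (!inflater.finished()) {
                int n = inflater.inflate(buf);
                if (n == 0 && (inflater.needsInput() || inflater.needsDictionary()))
                    throw new IllegalArgumentException("Truncated or unsupported input");
                out.write(buf, 0, n);
            }
            return out.toByteArray();
        } catch (DataFormatException e) {
            throw new IllegalArgumentException("Corrupted input", e);
        } finally {
            inflater.end();
        }
    }
}
```

Both methods allocate a temporary on-heap buffer per call, which is the heap overhead mentioned above.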
>>>
>>>
>>> 2017-06-09 11:21 GMT+03:00 Sergey Kozlov <[hidden email]>:
>>>
>>>> Hi
>>>>
>>>> * "Per-field compression" is applicable for huge BLOB fields and will
>>>> impose restrictions such as being unable to index such fields, slower
>>>> data retrieval, and potential OOM issues if the compression ratio is
>>>> too high.
>>>> But for some cases it makes sense
>>>>
>>>> On Fri, Jun 9, 2017 at 11:11 AM, Антон Чураев
><[hidden email]>
>>>> wrote:
>>>>
>>>> > Seems that Dmitry is referring to transparent data encryption. It is
>>>> > used throughout the whole database industry.
>>>> >
>>>> > 2017-06-09 10:50 GMT+03:00 Vladimir Ozerov
><[hidden email]>:
>>>> >
>>>> > > Dima,
>>>> > >
>>>> > > Encryption of certain fields is as bad as compression. First, it is a
>>>> > > huge change, which makes the already complex binary protocol even more
>>>> > > complex. Second, it has to be ported to the CPP and .NET platforms, as
>>>> > > well as to JDBC and ODBC.
>>>> > > Last, but most important: it is not our headache to encrypt sensitive
>>>> > > data. This is the user's responsibility. Nobody in their right mind
>>>> > > will store passwords in plain form. Instead, the user should encrypt
>>>> > > them on his own, choosing proper encryption parameters: algorithms,
>>>> > > key lengths, salts, etc. How are you going to expose this in the API
>>>> > > or configuration?
>>>> > >
>>>> > > We should not implement data encryption at the binary level; this is
>>>> > > out of the question. Encryption should be implemented at the
>>>> > > application level (user efforts), the transport layer (SSL - we
>>>> > > already have it), and possibly at the disk level (there are tools for
>>>> > > this already).
>>>> > >
>>>> > >
>>>> > > On Fri, Jun 9, 2017 at 9:06 AM, Vyacheslav Daradur <
>>>> [hidden email]>
>>>> > > wrote:
>>>> > >
>>>> > > > >> which is much less useful.
>>>> > > > I note that in some cases the object size is reduced by more than
>>>> > > > half.
>>>> > > >
>>>> > > > >> Would it be possible to change your implementation to handle the encryption instead?
>>>> > > > Yes, of course, there's not much difference between compression and
>>>> > > > encryption, including in my implementation of per-field compression.
>>>> > > >
>>>> > > > 2017-06-09 8:55 GMT+03:00 Dmitriy Setrakyan
><[hidden email]
>>>> >:
>>>> > > >
>>>> > > > > Vyacheslav,
>>>> > > > >
>>>> > > > > When this feature started out as data compression in Ignite, it
>>>> > > > > sounded very useful. Now it is unfolding as per-field compression,
>>>> > > > > which is much less useful. In fact, it is questionable whether it
>>>> > > > > is useful at all. The fact that this feature is implemented does
>>>> > > > > not make it mandatory for the community to accept it.
>>>> > > > >
>>>> > > > > However, as I mentioned before, per-field encryption is very
>>>> > > > > useful, as it would allow users to automatically encrypt certain
>>>> > > > > sensitive fields, like passwords, credit card numbers, etc. There
>>>> > > > > is not much conceptual difference between compressing a field vs
>>>> > > > > encrypting a field. Would it be possible to change your
>>>> > > > > implementation to handle the encryption instead?
>>>> > > > >
>>>> > > > > D.
>>>> > > > >
>>>> > > > > On Thu, Jun 8, 2017 at 10:42 PM, Vyacheslav Daradur <
>>>> > > [hidden email]
>>>> > > > >
>>>> > > > > wrote:
>>>> > > > >
>>>> > > > > > Guys, I want to be clear:
>>>> > > > > > * The "per-field compression" design is the result of researching
>>>> > > > > > Ignite's binary infrastructure and some of its other parts
>>>> > > > > > (querying, indexing, etc.)
>>>> > > > > > * Full compression of an object would be more effective, but in
>>>> > > > > > that case there is no compatibility with querying and indexing
>>>> > > > > > (or there is a large overhead from decompressing the full object
>>>> > > > > > (or cache pages) on demand)
>>>> > > > > > * "Per-field compression" is one of the ways to implement the
>>>> > > > > > compression feature
>>>> > > > > >
>>>> > > > > > I'm new to Ignite, so I may be mistaken about some things.
>>>> > > > > > For the last 3-4 months I've tried to start a discussion about
>>>> > > > > > the design, but nobody has answered (except Dmitry and Valentin,
>>>> > > > > > who were interested in how it works).
>>>> > > > > > But I understand that this is a community and nobody is obliged
>>>> > > > > > to anybody.
>>>> > > > > >
>>>> > > > > > There are strong Ignite experts.
>>>> > > > > > If they can help me and the community with a design of the
>>>> > > > > > compression feature, it will be great.
>>>> > > > > > At the moment I have the desire and the time to be engaged in
>>>> > > > > > development of the compression feature in Ignite.
>>>> > > > > > Let's use this opportunity :)
>>>> > > > > >
>>>> > > > > > 2017-06-09 5:36 GMT+03:00 Dmitriy Setrakyan <
>>>> [hidden email]
>>>> > >:
>>>> > > > > >
>>>> > > > > > > Igniters,
>>>> > > > > > >
>>>> > > > > > > I have never seen a single Ignite user asking about
>>>> compressing a
>>>> > > > > single
>>>> > > > > > > field. However, we have had requests to secure certain
>>>> fields,
>>>> > e.g.
>>>> > > > > > > passwords.
>>>> > > > > > >
>>>> > > > > > > I personally do not think per-field compression is
>needed,
>>>> unless
>>>> > > we
>>>> > > > > can
>>>> > > > > > > point out some concrete real life use cases.
>>>> > > > > > >
>>>> > > > > > > D.
>>>> > > > > > >
>>>> > > > > > > On Thu, Jun 8, 2017 at 3:42 AM, Vyacheslav Daradur <
>>>> > > > > [hidden email]>
>>>> > > > > > > wrote:
>>>> > > > > > >
>>>> > > > > > > > Anton,
>>>> > > > > > > >
>>>> > > > > > > > >> I thought that if compressed data is stored in memory, the
>>>> > > > > > > > >> data will be transmitted over the wire compressed too. Is
>>>> > > > > > > > >> that right?
>>>> > > > > > > >
>>>> > > > > > > > In per-field compression case - yes.
>>>> > > > > > > >
>>>> > > > > > > > 2017-06-08 13:36 GMT+03:00 Антон Чураев <
>>>> [hidden email]
>>>> > >:
>>>> > > > > > > >
>>>> > > > > > > > > Guys, could you please help me.
>>>> > > > > > > > > I thought that if compressed data is stored in memory, the
>>>> > > > > > > > > data will be transmitted over the wire compressed too. Is
>>>> > > > > > > > > that right?
>>>> > > > > > > > >
>>>> > > > > > > > > 2017-06-08 13:30 GMT+03:00 Vyacheslav Daradur <
>>>> > > > [hidden email]
>>>> > > > > >:
>>>> > > > > > > > >
>>>> > > > > > > > > > Vladimir,
>>>> > > > > > > > > >
>>>> > > > > > > > > > The main problem which I'm trying to solve is storing data
>>>> > > > > > > > > > in memory in a compressed form via Ignite.
>>>> > > > > > > > > > The main goal is to use memory more effectively.
>>>> > > > > > > > > >
>>>> > > > > > > > > > >> here the much simpler step would be to do full compression on a per-cache basis rather than dealing with the per-field case.
>>>> > > > > > > > > >
>>>> > > > > > > > > > Please explain your idea. Compress data by memory page?
>>>> > > > > > > > > > Is it compatible with querying and indexing?
>>>> > > > > > > > > >
>>>> > > > > > > > > > >> In the end, if a user would like to compress a particular field, he can always do it on his own
>>>> > > > > > > > > > I think we mustn't think this way: if a user needs
>>>> > > > > > > > > > something, he tries to choose a tool which has this feature
>>>> > > > > > > > > > OOTB.
>>>> > > > > > > > > >
>>>> > > > > > > > > >
>>>> > > > > > > > > >
>>>> > > > > > > > > > 2017-06-08 12:53 GMT+03:00 Vladimir Ozerov <
>>>> > > > [hidden email]
>>>> > > > > >:
>>>> > > > > > > > > >
>>>> > > > > > > > > > > Igniters,
>>>> > > > > > > > > > >
>>>> > > > > > > > > > > Honestly, I still do not see how to apply this feature to
>>>> > > > > > > > > > > Ignite gracefully. And the overall approach of compressing
>>>> > > > > > > > > > > only particular fields looks overcomplicated to me.
>>>> > > > > > > > > > > Remember that our main use case is an application without
>>>> > > > > > > > > > > classes on the server. It means that any kind of
>>>> > > > > > > > > > > annotations are inapplicable. To be more precise: a proper
>>>> > > > > > > > > > > API should be implemented to handle the no-class case
>>>> > > > > > > > > > > (e.g. how would one build such an object through
>>>> > > > > > > > > > > BinaryBuilder without a class?), and only then add
>>>> > > > > > > > > > > annotations as a convenient addition to the more basic API.
>>>> > > > > > > > > > >
>>>> > > > > > > > > > > It seems to me that a full implementation, which takes
>>>> > > > > > > > > > > into account a proper "classless" API, changes to binary
>>>> > > > > > > > > > > metadata to reflect compressed fields, changes to SQL,
>>>> > > > > > > > > > > changes to the binary protocol, and porting to .NET and
>>>> > > > > > > > > > > CPP, will yield a very complex solution with little value
>>>> > > > > > > > > > > to the product.
>>>> > > > > > > > > > >
>>>> > > > > > > > > > > Instead, as I proposed earlier, it seems that we'd better
>>>> > > > > > > > > > > start with the problem we are trying to solve. Basically,
>>>> > > > > > > > > > > compression could help in two cases:
>>>> > > > > > > > > > > 1) Transmitting data over wire - it should be implemented
>>>> > > > > > > > > > > on the communication layer and should not affect the
>>>> > > > > > > > > > > binary serialization component a lot.
>>>> > > > > > > > > > > 2) Storing data in memory - here the much simpler step
>>>> > > > > > > > > > > would be to do full compression on a per-cache basis
>>>> > > > > > > > > > > rather than dealing with the per-field case.
>>>> > > > > > > > > > >
>>>> > > > > > > > > > > In the end, if a user would like to compress a particular
>>>> > > > > > > > > > > field, he can always do it on his own, and set the already
>>>> > > > > > > > > > > compressed field in our BinaryObject.
>>>> > > > > > > > > > >
>>>> > > > > > > > > > > Vladimir.
>>>> > > > > > > > > > >
>>>> > > > > > > > > > >
>>>> > > > > > > > > > > On Thu, Jun 8, 2017 at 12:37 PM, Vyacheslav
>Daradur <
>>>> > > > > > > > > [hidden email]
>>>> > > > > > > > > > >
>>>> > > > > > > > > > > wrote:
>>>> > > > > > > > > > >
>>>> > > > > > > > > > > > Valentin,
>>>> > > > > > > > > > > >
>>>> > > > > > > > > > > > Yes, I have the prototype[1][2]
>>>> > > > > > > > > > > >
>>>> > > > > > > > > > > > You can see an example of the Java class[3] that I used
>>>> > > > > > > > > > > > in my benchmark.
>>>> > > > > > > > > > > > For example:
>>>> > > > > > > > > > > > class Foo {
>>>> > > > > > > > > > > >     @BinaryCompression
>>>> > > > > > > > > > > >     String data;
>>>> > > > > > > > > > > > }
>>>> > > > > > > > > > > > If the user decides to store the object in compressed
>>>> > > > > > > > > > > > form, he can use the annotation @BinaryCompression as
>>>> > > > > > > > > > > > shown above.
>>>> > > > > > > > > > > > It means the annotated field 'data' will be compressed
>>>> > > > > > > > > > > > at marshalling.
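
For readers who want to see how an annotation-driven marshaller can discover such fields, here is a small sketch. The annotation below is a stand-in declared only for this example; the actual definition of @BinaryCompression in the prototype is not shown in the thread, and the class name AnnotationScanSketch and the method compressedFields are hypothetical.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration of finding annotated fields via reflection. */
final class AnnotationScanSketch {
    /** Stand-in annotation; must be retained at runtime to be visible to reflection. */
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.FIELD)
    @interface BinaryCompression { }

    /** Example class mirroring the Foo class from the message above, plus a plain field. */
    static class Foo {
        @BinaryCompression
        String data;

        int id;
    }

    /** Returns the names of the declared fields that a marshaller would compress. */
    static List<String> compressedFields(Class<?> cls) {
        List<String> names = new ArrayList<>();
        for (Field f : cls.getDeclaredFields()) {
            if (f.isAnnotationPresent(BinaryCompression.class))
                names.add(f.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println("Fields to compress: " + compressedFields(Foo.class));
    }
}
```

This approach needs the class on the node that marshals the object, which is why a separate API would be required for the classless case discussed elsewhere in the thread.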
>>>> > > > > > > > > > > >
>>>> > > > > > > > > > > > [1] https://github.com/apache/ignite/pull/1951
>>>> > > > > > > > > > > > [2] https://issues.apache.org/jira/browse/IGNITE-5226
>>>> > > > > > > > > > > > [3] https://github.com/daradurvs/ignite-compression/blob/master/src/main/java/ru/daradurvs/ignite/compression/model/Audit1F.java
>>>> > > > > > > > > > > >
>>>> > > > > > > > > > > >
>>>> > > > > > > > > > > >
>>>> > > > > > > > > > > > 2017-06-08 2:04 GMT+03:00 Valentin Kulichenko
><
>>>> > > > > > > > > > > > [hidden email]
>>>> > > > > > > > > > > > >:
>>>> > > > > > > > > > > >
>>>> > > > > > > > > > > > > Vyacheslav, Anton,
>>>> > > > > > > > > > > > >
>>>> > > > > > > > > > > > > Are there any ideas and/or prototypes for the API? Your
>>>> > > > > > > > > > > > > design suggestions seem to make sense, but I would like
>>>> > > > > > > > > > > > > to see how all this will look from the user's
>>>> > > > > > > > > > > > > standpoint.
>>>> > > > > > > > > > > > >
>>>> > > > > > > > > > > > > -Val
>>>> > > > > > > > > > > > >
>>>> > > > > > > > > > > > > On Wed, Jun 7, 2017 at 1:06 AM, Антон
>Чураев <
>>>> > > > > > > > [hidden email]
>>>> > > > > > > > > >
>>>> > > > > > > > > > > > wrote:
>>>> > > > > > > > > > > > >
>>>> > > > > > > > > > > > > > Vyacheslav, correct me if something is wrong
>>>> > > > > > > > > > > > > >
>>>> > > > > > > > > > > > > > We could give users the opportunity to choose between
>>>> > > > > > > > > > > > > > CPU usage and MEM/NET usage by compressing some
>>>> > > > > > > > > > > > > > attributes of stored objects.
>>>> > > > > > > > > > > > > > You have studied the design, and it is possible to
>>>> > > > > > > > > > > > > > localize the changes in marshalling without affecting
>>>> > > > > > > > > > > > > > performance or current functionality.
>>>> > > > > > > > > > > > > >
>>>> > > > > > > > > > > > > > I think that it's useful for our project and users.
>>>> > > > > > > > > > > > > > Community, what do you think about this proposal?
>>>> > > > > > > > > > > > > >
>>>> > > > > > > > > > > > > >
>>>> > > > > > > > > > > > > > 2017-06-06 17:29 GMT+03:00 Vyacheslav
>Daradur <
>>>> > > > > > > > > [hidden email]
>>>> > > > > > > > > > >:
>>>> > > > > > > > > > > > > >
>>>> > > > > > > > > > > > > > > In short,
>>>> > > > > > > > > > > > > > >
>>>> > > > > > > > > > > > > > > During marshalling, a field is represented as a
>>>> > > > > > > > > > > > > > > BinaryFieldAccessor, which manages its marshalling.
>>>> > > > > > > > > > > > > > > It checks whether the field is marked with the
>>>> > > > > > > > > > > > > > > annotation @BinaryCompression; in that case the
>>>> > > > > > > > > > > > > > > binary representation of the field (byte array) is
>>>> > > > > > > > > > > > > > > compressed. It is marked as compressed by a type
>>>> > > > > > > > > > > > > > > constant (GridBinaryMarshaller.COMPRESSED), and
>>>> > > > > > > > > > > > > > > after this the compressed byte array is included in
>>>> > > > > > > > > > > > > > > the binary representation of the whole object. Note
>>>> > > > > > > > > > > > > > > that the header of the marshalled object is not
>>>> > > > > > > > > > > > > > > compressed. Compression affects only the object's
>>>> > > > > > > > > > > > > > > field representation.
>>>> > > > > > > > > > > > > > >
>>>> > > > > > > > > > > > > > > Objects in IgniteCache are represented as
>>>> > > > > > > > > > > > > > > BinaryObject, which is a wrapper over the byte array
>>>> > > > > > > > > > > > > > > of the marshalled object.
>>>> > > > > > > > > > > > > > > BinaryObject provides some useful methods, which are
>>>> > > > > > > > > > > > > > > used by Ignite systems.
>>>> > > > > > > > > > > > > > > For example, queries use the BinaryObject#field
>>>> > > > > > > > > > > > > > > method, which deserializes only a field of the
>>>> > > > > > > > > > > > > > > object, without deserializing the whole object.
>>>> > > > > > > > > > > > > > > During deserialization, if the BinaryObject#field
>>>> > > > > > > > > > > > > > > method meets the constant of the compressed type, it
>>>> > > > > > > > > > > > > > > decompresses this byte array, then continues
>>>> > > > > > > > > > > > > > > unmarshalling as usual.
>>>> > > > > > > > > > > > > > >
>>>> > > > > > > > > > > > > > > Now, I introduced the Compressor interface in
>>>> > > > > > > > > > > > > > > IgniteConfigurations; it allows the user to use his
>>>> > > > > > > > > > > > > > > own implementation of a compressor - it is the
>>>> > > > > > > > > > > > > > > requirement in the task[1].
>>>> > > > > > > > > > > > > > >
>>>> > > > > > > > > > > > > > > As far as I know, Vladimir Ozerov doesn't like the
>>>> > > > > > > > > > > > > > > idea of granting this opportunity to the user.
>>>> > > > > > > > > > > > > > > In that case we can choose a compression algorithm
>>>> > > > > > > > > > > > > > > which we will provide by default and move the
>>>> > > > > > > > > > > > > > > interface to the internals of the binary
>>>> > > > > > > > > > > > > > > infrastructure.
>>>> > > > > > > > > > > > > > > For this case I've prepared the benchmarks which
>>>> > > > > > > > > > > > > > > I sent earlier.
>>>> > > > > > > > > > > > > > >
>>>> > > > > > > > > > > > > > > I vote for the ZSTD algorithm[2]; it provides a good
>>>> > > > > > > > > > > > > > > compression ratio and good throughput. It has
>>>> > > > > > > > > > > > > > > implementations in Java, .NET and C++, and has an
>>>> > > > > > > > > > > > > > > ASF-friendly license, so we can use it on all Ignite
>>>> > > > > > > > > > > > > > > platforms.
>>>> > > > > > > > > > > > > > > You can look at an assessment of this algorithm in
>>>> > > > > > > > > > > > > > > my benchmarks.
>>>> > > > > > > > > > > > > > >
>>>> > > > > > > > > > > > > > > [1] https://issues.apache.org/jira/browse/IGNITE-3592
>>>> > > > > > > > > > > > > > > [2] https://github.com/facebook/zstd
>>>> > > > > > > > > > > > > > >
>>>> > > > > > > > > > > > > > >
>>>> > > > > > > > > > > > > > > 2017-06-06 16:02 GMT+03:00 Антон Чураев
><
>>>> > > > > > > > [hidden email]
>>>> > > > > > > > > >:
>>>> > > > > > > > > > > > > > >
>>>> > > > > > > > > > > > > > > > Looks good for me.
>>>> > > > > > > > > > > > > > > >
>>>> > > > > > > > > > > > > > > > Could you describe the design of the implementation
>>>> > > > > > > > > > > > > > > > in a couple of sentences?
>>>> > > > > > > > > > > > > > > > So that we can estimate the completeness and
>>>> > > > > > > > > > > > > > > > complexity of the proposal.
>>>> > > > > > > > > > > > > > > >
>>>> > > > > > > > > > > > > > > > 2017-06-06 15:26 GMT+03:00 Vyacheslav
>>>> Daradur <
>>>> > > > > > > > > > > [hidden email]
>>>> > > > > > > > > > > > >:
>>>> > > > > > > > > > > > > > > >
>>>> > > > > > > > > > > > > > > > > Anton,
>>>> > > > > > > > > > > > > > > > >
>>>> > > > > > > > > > > > > > > > > Of course, the solution does not affect the
>>>> > > > > > > > > > > > > > > > > existing implementation. I mean, there are no
>>>> > > > > > > > > > > > > > > > > changes if the user does not use the annotation
>>>> > > > > > > > > > > > > > > > > @BinaryCompression (no performance changes).
>>>> > > > > > > > > > > > > > > > > Only if the user decides to use compression on a
>>>> > > > > > > > > > > > > > > > > specific field or fields of a class will
>>>> > > > > > > > > > > > > > > > > compression be used at marshalling for the
>>>> > > > > > > > > > > > > > > > > annotated fields.
>>>> > > > > > > > > > > > > > > > >
>>>> > > > > > > > > > > > > > > > > 2017-06-06 15:10 GMT+03:00 Антон
>Чураев <
>>>> > > > > > > > > > [hidden email]
>>>> > > > > > > > > > > >:
>>>> > > > > > > > > > > > > > > > >
>>>> > > > > > > > > > > > > > > > > > Vyacheslav,
>>>> > > > > > > > > > > > > > > > > >
>>>> > > > > > > > > > > > > > > > > > Is it possible to propose an implementation that
>>>> > > > > > > > > > > > > > > > > > can be switched on on demand?
>>>> > > > > > > > > > > > > > > > > > In this case it should not affect the
>>>> > > > > > > > > > > > > > > > > > performance of the current solution.
>>>> > > > > > > > > > > > > > > > > >
>>>> > > > > > > > > > > > > > > > > > I mean that users should decide what is more
>>>> > > > > > > > > > > > > > > > > > important for them: throughput or memory/net
>>>> > > > > > > > > > > > > > > > > > usage.
>>>> > > > > > > > > > > > > > > > > > Maybe they will choose to compress not all
>>>> > > > > > > > > > > > > > > > > > objects, or only some attributes of objects.
>>>> > > > > > > > > > > > > > > > > >
>>>> > > > > > > > > > > > > > > > > > 2017-06-06 14:48 GMT+03:00
>Vyacheslav
>>>> > > Daradur <
>>>> > > > > > > > > > > > > [hidden email]
>>>> > > > > > > > > > > > > > >:
>>>> > > > > > > > > > > > > > > > > >
>>>> > > > > > > > > > > > > > > > > > > Conclusion:
>>>> > > > > > > > > > > > > > > > > > > The provided solution allows reducing the size
>>>> > > > > > > > > > > > > > > > > > > of an object in IgniteCache at the cost of a
>>>> > > > > > > > > > > > > > > > > > > throughput reduction (small in some cases); it
>>>> > > > > > > > > > > > > > > > > > > depends on which part of the object is
>>>> > > > > > > > > > > > > > > > > > > compressed and on the compression algorithm.
>>>> > > > > > > > > > > > > > > > > > > I mean, we can make more effective use of
>>>> > > > > > > > > > > > > > > > > > > memory, and in some cases it can reduce the
>>>> > > > > > > > > > > > > > > > > > > load on the interconnect (replication,
>>>> > > > > > > > > > > > > > > > > > > rebalancing).
>>>> > > > > > > > > > > > > > > > > > >
>>>> > > > > > > > > > > > > > > > > > > It will be particularly useful for object
>>>> > > > > > > > > > > > > > > > > > > fields which are large text (>~ 250 bytes) and
>>>> > > > > > > > > > > > > > > > > > > can be effectively compressed.
>>>> > > > > > > > > > > > > > > > > > >
>>>> > > > > > > > > > > > > > > > > > > 2017-06-06 12:00 GMT+03:00
>Антон
>>>> Чураев <
>>>> > > > > > > > > > > > [hidden email]
>>>> > > > > > > > > > > > > >:
>>>> > > > > > > > > > > > > > > > > > >
>>>> > > > > > > > > > > > > > > > > > > > Vyacheslav, thank you! But could you please
>>>> > > > > > > > > > > > > > > > > > > > provide conclusions or proposals based on
>>>> > > > > > > > > > > > > > > > > > > > these benchmarks?
>>>> > > > > > > > > > > > > > > > > > > >
>>>> > > > > > > > > > > > > > > > > > > > 2017-06-06 11:28 GMT+03:00
>>>> Vyacheslav
>>>> > > > > Daradur <
>>>> > > > > > > > > > > > > > > [hidden email]
>>>> > > > > > > > > > > > > > > > >:
>>>> > > > > > > > > > > > > > > > > > > >
>>>> > > > > > > > > > > > > > > > > > > > > Dmitry,
>>>> > > > > > > > > > > > > > > > > > > > >
>>>> > > > > > > > > > > > > > > > > > > > > Excel-pages:
>>>> > > > > > > > > > > > > > > > > > > > >
>>>> > > > > > > > > > > > > > > > > > > > > 1) "Compression ratio (2)" - shows object
>>>> > > > > > > > > > > > > > > > > > > > > size with and without compression.
>>>> > > > > > > > > > > > > > > > > > > > > (Conditions: literal text)
>>>> > > > > > > > > > > > > > > > > > > > > The 1st graph shows the compression ratios
>>>> > > > > > > > > > > > > > > > > > > > > of different compression algorithms
>>>> > > > > > > > > > > > > > > > > > > > > depending on the size of the compressed
>>>> > > > > > > > > > > > > > > > > > > > > field.
>>>> > > > > > > > > > > > > > > > > > > > > The 2nd graph shows an evaluation of object
>>>> > > > > > > > > > > > > > > > > > > > > size depending on sizes and compression
>>>> > > > > > > > > > > > > > > > > > > > > algorithms.
>>>> > > > > > > > > > > > > > > > > > > > >
>>>> > > > > > > > > > > > > > > > > > > > > 2) "Compression ratio (1)" - shows object
>>>> > > > > > > > > > > > > > > > > > > > > size with and without compression.
>>>> > > > > > > > > > > > > > > > > > > > > (Conditions: badly compressible character
>>>> > > > > > > > > > > > > > > > > > > > > sequence)
>>>> > > > > > > > > > > > > > > > > > > > > The 1st graph shows the compression ratios
>>>> > > > > > > > > > > > > > > > > > > > > of different compression algorithms
>>>> > > > > > > > > > > > > > > > > > > > > depending on the size of the compressed
>>>> > > > > > > > > > > > > > > > > > > > > field.
>>>> > > > > > > > > > > > > > > > > > > > > The 2nd graph shows an evaluation of object
>>>> > > > > > > > > > > > > > > > > > > > > size depending on sizes and compression
>>>> > > > > > > > > > > > > > > > > > > > > algorithms.
>>>> > > > > > > > > > > > > > > > > > > > >
>>>> > > > > > > > > > > > > > > > > > > > > 3) "put-avg" - shows the average time of the
>>>> > > > > > > > > > > > > > > > > > > > > "put" operation depending on size and
>>>> > > > > > > > > > > > > > > > > > > > > compression algorithms.
>>>> > > > > > > > > > > > > > > > > > > > >
>>>> > > > > > > > > > > > > > > > > > > > > 4) "put-thrpt" - shows the throughput of the
>>>> > > > > > > > > > > > > > > > > > > > > "put" operation depending on size and
>>>> > > > > > > > > > > > > > > > > > > > > compression algorithms.
>>>> > > > > > > > > > > > > > > > > > > > >
>>>> > > > > > > > > > > > > > > > > > > > > 5) "get-avg" - shows the average time of the
>>>> > > > > > > > > > > > > > > > > > > > > "get" operation depending on size and
>>>> > > > > > > > > > > > > > > > > > > > > compression algorithms.
>>>> > > > > > > > > > > > > > > > > > > > >
>>>> > > > > > > > > > > > > > > > > > > > > 6) 'get-thrpt" - shows
>>>> throughput of
>>>> > > the
>>>> > > > > > "get"
>>>> > > > > > > > > > > operation
>>>> > > > > > > > > > > > > > > > depending
>>>> > > > > > > > > > > > > > > > > on
>>>> > > > > > > > > > > > > > > > > > > > size
>>>> > > > > > > > > > > > > > > > > > > > > and compression algorithms.
>>>> > > > > > > > > > > > > > > > > > > > >
>>>> > > > > > > > > > > > > > > > > > > > >
>>>> > > > > > > > > > > > > > > > > > > > >
>>>> > > > > > > > > > > > > > > > > > > > >
>>>> > > > > > > > > > > > > > > > > > > > > 2017-06-06 10:59 GMT+03:00
>>>> Dmitriy
>>>> > > > > Setrakyan
>>>> > > > > > <
>>>> > > > > > > > > > > > > > > > > [hidden email]
>>>> > > > > > > > > > > > > > > > > > >:
>>>> > > > > > > > > > > > > > > > > > > > >
>>>> > > > > > > > > > > > > > > > > > > > > > Vladimir, I am not sure
>how to
>>>> > > > interpret
>>>> > > > > > the
>>>> > > > > > > > > > graphs?
>>>> > > > > > > > > > > > What
>>>> > > > > > > > > > > > > > are
>>>> > > > > > > > > > > > > > > > we
>>>> > > > > > > > > > > > > > > > > > > > looking
>>>> > > > > > > > > > > > > > > > > > > > > > at?
>>>> > > > > > > > > > > > > > > > > > > > > >
>>>> > > > > > > > > > > > > > > > > > > > > > On Tue, Jun 6, 2017 at
>12:33
>>>> AM,
>>>> > > > > Vyacheslav
>>>> > > > > > > > > > Daradur <
>>>> > > > > > > > > > > > > > > > > > > > [hidden email]
>>>> > > > > > > > > > > > > > > > > > > > > >
>>>> > > > > > > > > > > > > > > > > > > > > > wrote:
>>>> > > > > > > > > > > > > > > > > > > > > >
>>>> > > > > > > > > > > > > > > > > > > > > > > Hi, Igniters.
>>>> > > > > > > > > > > > > > > > > > > > > > >
>>>> > > > > > > > > > > > > > > > > > > > > > > I've prepared some
>>>> benchmarking.
>>>> > > > > Results
>>>> > > > > > > [1].
>>>> > > > > > > > > > > > > > > > > > > > > > >
>>>> > > > > > > > > > > > > > > > > > > > > > > And I've prepared the
>>>> evaluation
>>>> > in
>>>> > > > the
>>>> > > > > > > form
>>>> > > > > > > > of
>>>> > > > > > > > > > > > > diagrams
>>>> > > > > > > > > > > > > > > [2].
>>>> > > > > > > > > > > > > > > > > > > > > > >
>>>> > > > > > > > > > > > > > > > > > > > > > > I hope that helps to
>>>> interest the
>>>> > > > > > community
>>>> > > > > > > > and
>>>> > > > > > > > > > > > > > > accelerates a
>>>> > > > > > > > > > > > > > > > > > > > reaction
>>>> > > > > > > > > > > > > > > > > > > > > to
>>>> > > > > > > > > > > > > > > > > > > > > > > this improvment :)
>>>> > > > > > > > > > > > > > > > > > > > > > >
>>>> > > > > > > > > > > > > > > > > > > > > > > [1]
>>>> > > > > > > > > > > > > > > > > > > > > > >
>>>> https://github.com/daradurvs/
>>>> > > > > > > > > > > > ignite-compression/tree/
>>>> > > > > > > > > > > > > > > > > > > > > > >
>>>> master/src/main/resources/result
>>>> > > > > > > > > > > > > > > > > > > > > > > [2]
>>>> > https://drive.google.com/file/
>>>> > > d/
>>>> > > > > > > > > > > > > > > > > > 0B2CeUAOgrHkoMklyZ25YTEdKcEk/
>>>> > > > > > > > > > > > > > > > > > > > view
>>>> > > > > > > > > > > > > > > > > > > > > > >
>>>> > > > > > > > > > > > > > > > > > > > > > >
>>>> > > > > > > > > > > > > > > > > > > > > > >
>>>> > > > > > > > > > > > > > > > > > > > > > > 2017-05-24 9:49
>GMT+03:00
>>>> > > Vyacheslav
>>>> > > > > > > Daradur
>>>> > > > > > > > <
>>>> > > > > > > > > > > > > > > > > > [hidden email]
>>>> > > > > > > > > > > > > > > > > > > >:
>>>> > > > > > > > > > > > > > > > > > > > > > >
>>>> > > > > > > > > > > > > > > > > > > > > > > > Guys, any thoughts?
>>>> > > > > > > > > > > > > > > > > > > > > > > >
>>>> > > > > > > > > > > > > > > > > > > > > > > > 2017-05-16 13:40
>GMT+03:00
>>>> > > > Vyacheslav
>>>> > > > > > > > > Daradur <
>>>> > > > > > > > > > > > > > > > > > > [hidden email]
>>>> > > > > > > > > > > > > > > > > > > > >:
>>>> > > > > > > > > > > > > > > > > > > > > > > >
>>>> > > > > > > > > > > > > > > > > > > > > > > >> Hi guys,
>>>> > > > > > > > > > > > > > > > > > > > > > > >>
>>>> > > > > > > > > > > > > > > > > > > > > > > >> I've prepared the PR
>to
>>>> show
>>>> > my
>>>> > > > > idea.
>>>> > > > > > > > > > > > > > > > > > > > > > > >>
>>>> https://github.com/apache/
>>>> > > > > > > > > > > ignite/pull/1951/files
>>>> > > > > > > > > > > > > > > > > > > > > > > >>
>>>> > > > > > > > > > > > > > > > > > > > > > > >> About querying -
>I've just
>>>> > > copied
>>>> > > > > > > existing
>>>> > > > > > > > > > tests
>>>> > > > > > > > > > > > and
>>>> > > > > > > > > > > > > > > have
>>>> > > > > > > > > > > > > > > > > > > > annotated
>>>> > > > > > > > > > > > > > > > > > > > > > the
>>>> > > > > > > > > > > > > > > > > > > > > > > >> testing data.
>>>> > > > > > > > > > > > > > > > > > > > > > > >>
>>>> https://github.com/apache/
>>>> > > > > > > > > > > > > > ignite/pull/1951/files#diff-
>>>> > > > > > > > > > > > > > > > > c19a9d
>>>> > > > > > > > > > > > > > > > > > > > > > > >>
>f4058141d059bb577e75244764
>>>> > > > > > > > > > > > > > > > > > > > > > > >>
>>>> > > > > > > > > > > > > > > > > > > > > > > >> It means fields
>which
>>>> will be
>>>> > > > marked
>>>> > > > > > by
>>>> > > > > > > > > > > > > > > @BinaryCompression
>>>> > > > > > > > > > > > > > > > > > will
>>>> > > > > > > > > > > > > > > > > > > be
>>>> > > > > > > > > > > > > > > > > > > > > > > >> compressed at
>marshalling
>>>> via
>>>> > > > > > > > > > BinaryMarshaller.
>>>> > > > > > > > > > > > > > > > > > > > > > > >>
>>>> > > > > > > > > > > > > > > > > > > > > > > >> This solution has no
>>>> effect on
>>>> > > > > > existing
>>>> > > > > > > > data
>>>> > > > > > > > > > or
>>>> > > > > > > > > > > > > > project
>>>> > > > > > > > > > > > > > > > > > > > > architecture.
>>>> > > > > > > > > > > > > > > > > > > > > > > >>
>>>> > > > > > > > > > > > > > > > > > > > > > > >> I'll be glad to see
>your
>>>> > > thougths.
>>>> > > > > > > > > > > > > > > > > > > > > > > >>
>>>> > > > > > > > > > > > > > > > > > > > > > > >>
>>>> > > > > > > > > > > > > > > > > > > > > > > >> 2017-05-15 19:18
>GMT+03:00
>>>> > > > > Vyacheslav
>>>> > > > > > > > > Daradur
>>>> > > > > > > > > > <
>>>> > > > > > > > > > > > > > > > > > > > [hidden email]
>>>> > > > > > > > > > > > > > > > > > > > > >:
>>>> > > > > > > > > > > > > > > > > > > > > > > >>
>>>> > > > > > > > > > > > > > > > > > > > > > > >>> Dmitriy,
>>>> > > > > > > > > > > > > > > > > > > > > > > >>>
>>>> > > > > > > > > > > > > > > > > > > > > > > >>> I have ready
>prototype. I
>>>> > want
>>>> > > to
>>>> > > > > > show
>>>> > > > > > > > it.
>>>> > > > > > > > > > > > > > > > > > > > > > > >>> It is always easier
>to
>>>> > discuss
>>>> > > on
>>>> > > > > > > > example.
>>>> > > > > > > > > > > > > > > > > > > > > > > >>>
>>>> > > > > > > > > > > > > > > > > > > > > > > >>> 2017-05-15 19:02
>>>> GMT+03:00
>>>> > > > Dmitriy
>>>> > > > > > > > > Setrakyan
>>>> > > > > > > > > > <
>>>> > > > > > > > > > > > > > > > > > > > > [hidden email]
>>>> > > > > > > > > > > > > > > > > > > > > > >:
>>>> > > > > > > > > > > > > > > > > > > > > > > >>>
>>>> > > > > > > > > > > > > > > > > > > > > > > >>>> Vyacheslav,
>>>> > > > > > > > > > > > > > > > > > > > > > > >>>>
>>>> > > > > > > > > > > > > > > > > > > > > > > >>>> I think it is a
>bit
>>>> > premature
>>>> > > to
>>>> > > > > > > > provide a
>>>> > > > > > > > > > PR
>>>> > > > > > > > > > > > > > without
>>>> > > > > > > > > > > > > > > > > > getting
>>>> > > > > > > > > > > > > > > > > > > a
>>>> > > > > > > > > > > > > > > > > > > > > > > >>>> community
>>>> > > > > > > > > > > > > > > > > > > > > > > >>>> consensus on the
>dev
>>>> list.
>>>> > > > Please
>>>> > > > > > > allow
>>>> > > > > > > > > some
>>>> > > > > > > > > > > > time
>>>> > > > > > > > > > > > > > for
>>>> > > > > > > > > > > > > > > > the
>>>> > > > > > > > > > > > > > > > > > > > > community
>>>> > > > > > > > > > > > > > > > > > > > > > to
>>>> > > > > > > > > > > > > > > > > > > > > > > >>>> respond.
>>>> > > > > > > > > > > > > > > > > > > > > > > >>>>
>>>> > > > > > > > > > > > > > > > > > > > > > > >>>> D.
>>>> > > > > > > > > > > > > > > > > > > > > > > >>>>
>>>> > > > > > > > > > > > > > > > > > > > > > > >>>> On Mon, May 15,
>2017 at
>>>> 6:36
>>>> > > AM,
>>>> > > > > > > > > Vyacheslav
>>>> > > > > > > > > > > > > Daradur
>>>> > > > > > > > > > > > > > <
>>>> > > > > > > > > > > > > > > > > > > > > > > >>>>
>[hidden email]>
>>>> > > > > > > > > > > > > > > > > > > > > > > >>>> wrote:
>>>> > > > > > > > > > > > > > > > > > > > > > > >>>>
>>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > I created the
>ticket:
>>>> > > > > > > > > > > > > > > https://issues.apache.org/jira
>>>> > > > > > > > > > > > > > > > > > > > > > > >>>>
>/browse/IGNITE-5226
>>>> > > > > > > > > > > > > > > > > > > > > > > >>>> >
>>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > I'll prepare a
>PR with
>>>> > > > described
>>>> > > > > > > > > solution
>>>> > > > > > > > > > in
>>>> > > > > > > > > > > > > > couple
>>>> > > > > > > > > > > > > > > of
>>>> > > > > > > > > > > > > > > > > > days.
>>>> > > > > > > > > > > > > > > > > > > > > > > >>>> >
>>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > 2017-05-15 15:05
>>>> GMT+03:00
>>>> > > > > > > Vyacheslav
>>>> > > > > > > > > > > Daradur
>>>> > > > > > > > > > > > <
>>>> > > > > > > > > > > > > > > > > > > > > > [hidden email]
>>>> > > > > > > > > > > > > > > > > > > > > > > >:
>>>> > > > > > > > > > > > > > > > > > > > > > > >>>> >
>>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > > Hi, Igniters!
>>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >
>>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > > Apache 2.0 is
>>>> released.
>>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >
>>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > > Let's continue
>the
>>>> > > > discussion
>>>> > > > > > > about
>>>> > > > > > > > a
>>>> > > > > > > > > > > > > > compression
>>>> > > > > > > > > > > > > > > > > > design.
>>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >
>>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > > At the moment,
>I
>>>> found
>>>> > > only
>>>> > > > > one
>>>> > > > > > > > > solution
>>>> > > > > > > > > > > > which
>>>> > > > > > > > > > > > > > is
>>>> > > > > > > > > > > > > > > > > > > compatible
>>>> > > > > > > > > > > > > > > > > > > > > > with
>>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > querying
>>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > > and indexing,
>this
>>>> is
>>>> > > > > > > > > per-objects-field
>>>> > > > > > > > > > > > > > > compression.
>>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > > Per-fields
>>>> compression
>>>> > > means
>>>> > > > > > that
>>>> > > > > > > > > > metadata
>>>> > > > > > > > > > > > (a
>>>> > > > > > > > > > > > > > > > header)
>>>> > > > > > > > > > > > > > > > > of
>>>> > > > > > > > > > > > > > > > > > > an
>>>> > > > > > > > > > > > > > > > > > > > > > object
>>>> > > > > > > > > > > > > > > > > > > > > > > >>>> won't
>>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > > be compressed,
>only
>>>> > > > serialized
>>>> > > > > > > > values
>>>> > > > > > > > > of
>>>> > > > > > > > > > > an
>>>> > > > > > > > > > > > > > object
>>>> > > > > > > > > > > > > > > > > > fields
>>>> > > > > > > > > > > > > > > > > > > > (in
>>>> > > > > > > > > > > > > > > > > > > > > > > bytes
>>>> > > > > > > > > > > > > > > > > > > > > > > >>>> array
>>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > > form) will be
>>>> > compressed.
>>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >
>>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > > This solution
>have
>>>> some
>>>> > > > > > > contentious
>>>> > > > > > > > > > > issues:
>>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > > - small
>values, like
>>>> > > > > primitives
>>>> > > > > > > and
>>>> > > > > > > > > > short
>>>> > > > > > > > > > > > > > arrays -
>>>> > > > > > > > > > > > > > > > > there
>>>> > > > > > > > > > > > > > > > > > > > isn't
>>>> > > > > > > > > > > > > > > > > > > > > > > >>>> sense to
>>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > > compress them;
>>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > > - there is no
>>>> possible
>>>> > to
>>>> > > > use
>>>> > > > > > > > > > compression
>>>> > > > > > > > > > > > with
>>>> > > > > > > > > > > > > > > > > > > > java-predefined
>>>> > > > > > > > > > > > > > > > > > > > > > > >>>> types;
>>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >
>>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > > We can provide
>an
>>>> > > > annotation,
>>>> > > > > > > > > > > > > > @IgniteCompression -
>>>> > > > > > > > > > > > > > > > for
>>>> > > > > > > > > > > > > > > > > > > > > example,
>>>> > > > > > > > > > > > > > > > > > > > > > > >>>> which can
>>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > > be used by
>users for
>>>> > > marking
>>>> > > > > > > fields
>>>> > > > > > > > to
>>>> > > > > > > > > > > > > compress.
>>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >
>>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > > Any thoughts?
>>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >
>>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > > Maybe someone
>>>> already
>>>> > have
>>>> > > > > ready
>>>> > > > > > > > > design?
>>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >
>>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > > 2017-04-10
>11:06
>>>> > GMT+03:00
>>>> > > > > > > > Vyacheslav
>>>> > > > > > > > > > > > Daradur
>>>> > > > > > > > > > > > > <
>>>> > > > > > > > > > > > > > > > > > > > > > > [hidden email]
>>>> > > > > > > > > > > > > > > > > > > > > > > >>>> >:
>>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >
>>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >> Alexey,
>>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>
>>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >> Yes, I've
>read it.
>>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>
>>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >> Ok, let's
>discuss
>>>> about
>>>> > > > > public
>>>> > > > > > > API
>>>> > > > > > > > > > > design.
>>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>
>>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >> I think we
>need to
>>>> add
>>>> > > > some a
>>>> > > > > > > > > configure
>>>> > > > > > > > > > > > > entity
>>>> > > > > > > > > > > > > > to
>>>> > > > > > > > > > > > > > > > > > > > > > > >>>>
>CacheConfiguration,
>>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >> which will
>contain
>>>> the
>>>> > > > > > Compressor
>>>> > > > > > > > > > > interface
>>>> > > > > > > > > > > > > > > > > > > implementation
>>>> > > > > > > > > > > > > > > > > > > > > and
>>>> > > > > > > > > > > > > > > > > > > > > > > some
>>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > usefull
>>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >> parameters.
>>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >> Or maybe to
>>>> provide a
>>>> > > > > > > > > BinaryMarshaller
>>>> > > > > > > > > > > > > > decorator,
>>>> > > > > > > > > > > > > > > > > which
>>>> > > > > > > > > > > > > > > > > > > > will
>>>> > > > > > > > > > > > > > > > > > > > > be
>>>> > > > > > > > > > > > > > > > > > > > > > > >>>> compress
>>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >> data after
>>>> marshalling.
>>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>
>>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>
>>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >> 2017-04-10
>10:40
>>>> > > GMT+03:00
>>>> > > > > > Alexey
>>>> > > > > > > > > > > > Kuznetsov <
>>>> > > > > > > > > > > > > > > > > > > > > > > [hidden email]
>>>> > > > > > > > > > > > > > > > > > > > > > > >>>> >:
>>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>
>>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> Vyacheslav,
>>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>>
>>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> Did you read
>>>> initial
>>>> > > > > > discussion
>>>> > > > > > > > [1]
>>>> > > > > > > > > > > about
>>>> > > > > > > > > > > > > > > > > compression?
>>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> As far as I
>>>> remember
>>>> > we
>>>> > > > > agreed
>>>> > > > > > > to
>>>> > > > > > > > > add
>>>> > > > > > > > > > > only
>>>> > > > > > > > > > > > > > some
>>>> > > > > > > > > > > > > > > > > > > > "top-level"
>>>> > > > > > > > > > > > > > > > > > > > > > API
>>>> > > > > > > > > > > > > > > > > > > > > > > in
>>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > order
>>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> to
>>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> provide a
>way for
>>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> Ignite users
>to
>>>> inject
>>>> > > > some
>>>> > > > > > sort
>>>> > > > > > > > of
>>>> > > > > > > > > > > custom
>>>> > > > > > > > > > > > > > > > > > compression.
>>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>>
>>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>>
>>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> [1]
>>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>>
>>>> > > > > > http://apache-ignite-developer
>>>> > > > > > > > > > > > > > > s.2346864.n4.nabble
>>>> > > > > > > > > > > > > > > > .
>>>> > > > > > > > > > > > > > > > > > > > > com/Data-c
>>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>>
>>>> > > ompression-in-Ignite-2-0-
>>>> > > > > > > > > td10099.html
>>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>>
>>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> On Mon, Apr
>10,
>>>> 2017
>>>> > at
>>>> > > > 2:19
>>>> > > > > > PM,
>>>> > > > > > > > > > > > daradurvs <
>>>> > > > > > > > > > > > > > > > > > > > > > [hidden email]
>>>> > > > > > > > > > > > > > > > > > > > > > > >
>>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > wrote:
>>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>>
>>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > Hi
>Igniters!
>>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> >
>>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > I am
>interested
>>>> in
>>>> > > this
>>>> > > > > > task.
>>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > Provide
>some
>>>> kind of
>>>> > > > > > pluggable
>>>> > > > > > > > > > > > compression
>>>> > > > > > > > > > > > > > SPI
>>>> > > > > > > > > > > > > > > > > > support
>>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > <
>>>> > > > > https://issues.apache.org/
>>>> > > > > > > > > > > > > > > > > jira/browse/IGNITE-3592>
>>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> >
>>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > I
>developed a
>>>> > solution
>>>> > > > on
>>>> > > > > > > > > > > > > > > > BinaryMarshaller-level,
>>>> > > > > > > > > > > > > > > > > > but
>>>> > > > > > > > > > > > > > > > > > > > > > reviewer
>>>> > > > > > > > > > > > > > > > > > > > > > > >>>> has
>>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> rejected
>>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > it.
>>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> >
>>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > Let's
>continue
>>>> > > > discussion
>>>> > > > > of
>>>> > > > > > > > task
>>>> > > > > > > > > > > goals
>>>> > > > > > > > > > > > > and
>>>> > > > > > > > > > > > > > > > > solution
>>>> > > > > > > > > > > > > > > > > > > > > design.
>>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > As I
>understood
>>>> > that,
>>>> > > > the
>>>> > > > > > main
>>>> > > > > > > > > goal
>>>> > > > > > > > > > of
>>>> > > > > > > > > > > > > this
>>>> > > > > > > > > > > > > > > task
>>>> > > > > > > > > > > > > > > > > is
>>>> > > > > > > > > > > > > > > > > > to
>>>> > > > > > > > > > > > > > > > > > > > > store
>>>> > > > > > > > > > > > > > > > > > > > > > > >>>> data in
>>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > compressed
>form.
>>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > This is
>what I
>>>> need
>>>> > > from
>>>> > > > > > > Ignite
>>>> > > > > > > > as
>>>> > > > > > > > > > its
>>>> > > > > > > > > > > > > user.
>>>> > > > > > > > > > > > > > > > > > > Compression
>>>> > > > > > > > > > > > > > > > > > > > > > > >>>> provides
>>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> economy
>>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > on
>>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > servers.
>>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > We can
>store
>>>> more
>>>> > data
>>>> > > > on
>>>> > > > > > same
>>>> > > > > > > > > > servers
>>>> > > > > > > > > > > > at
>>>> > > > > > > > > > > > > > the
>>>> > > > > > > > > > > > > > > > cost
>>>> > > > > > > > > > > > > > > > > > of
>>>> > > > > > > > > > > > > > > > > > > > > > > >>>> increasing CPU
>>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> >
>utilization.
>>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> >
>>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > I'm
>researching
>>>> a
>>>> > > > > > possibility
>>>> > > > > > > of
>>>> > > > > > > > > > > > > > > implementation
>>>> > > > > > > > > > > > > > > > of
>>>> > > > > > > > > > > > > > > > > > > > > > compression
>>>> > > > > > > > > > > > > > > > > > > > > > > >>>> at the
>>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> >
>cache-level.
>>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> >
>>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > Any
>thoughts?
>>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> >
>>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > --
>>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > Best
>regards,
>>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > Vyacheslav
>>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> >
>>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> >
>>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> >
>>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> >
>>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > --
>>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > View this
>>>> message in
>>>> > > > > > context:
>>>> > > > > > > > > > > > > > > > > http://apache-ignite-
>>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> >
>>>> > > > > > developers.2346864.n4.nabble.
>>>> > > > > > > > > > > > > > > > > > com/Data-compression-in-
>>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> >
>>>> > > > > > Ignite-2-0-tp10099p16317.html
>>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> > Sent from
>the
>>>> Apache
>>>> > > > > Ignite
>>>> > > > > > > > > > Developers
>>>> > > > > > > > > > > > > > mailing
>>>> > > > > > > > > > > > > > > > > list
>>>> > > > > > > > > > > > > > > > > > > > > archive
>>>> > > > > > > > > > > > > > > > > > > > > > at
>>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> Nabble.com.
>>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> >
>>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>>
>>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>>
>>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>>
>>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> --
>>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>> Alexey
>Kuznetsov
>>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>>
>>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>
>>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>
>>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>
>>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >> --
>>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >> Best Regards,
>>>> > Vyacheslav
>>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >>
>>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >
>>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >
>>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > >
>>>> > > > > > > > > > > > > > > > > > > > > > > >>>> > > --
>>>> > > > > > > > > > > > > > > > > > > > > > >
>>>>
>>> ...
>>
>> [Message clipped]

Re: Data compression in Ignite 2.0

Alexey Kuznetsov
In reply to this post by Vladimir Ozerov
Vova,

Finally we are back to my initial idea: to look at how "big databases"
compress data :)

Just a reminder of how IBM DB2 does this [1].

[1] http://www.ibm.com/developerworks/data/library/techarticle/dm-1205db210compression/

On Tue, Aug 1, 2017 at 4:15 PM, Vladimir Ozerov <[hidden email]>
wrote:

> Vyacheslav,
>
> This is not about my needs, but about the product :-) BinaryObject is a
> central entity used for both data transfer and data storage. This is both
> good and bad at the same time.
>
> The good thing is that as we optimize the binary protocol, we improve both
> network and storage performance at the same time. We have at least 3 things
> which will be included in the product soon: varint encoding [1], optimized
> string encoding [2] and null-field optimization [3]. The bad thing is that
> the binary object format is not well suited for data storage optimizations,
> including compression. For example, one good compression technique is to
> organize data in a column-store format, or to introduce a shared "dictionary"
> of unique values at the cache level. In both cases N equal values are not
> stored N times. Instead, we store one value and N references to it (or
> something similar). This way 2x-10x compression is possible, depending on the
> workload type. The binary object protocol with some compression on top of it
> cannot give such an improvement, because it compresses data in individual
> objects instead of compressing the whole cache data in a single context.
>
> That said, I propose to give up on adding compression to BinaryObject. This
> is a dead end. Instead, we should:
> 1) Optimize the protocol itself to be more compact, as described in the
> aforementioned Ignite tickets
> 2) Start a new discussion about storage compression
>
> You can read papers from other vendors to get a better understanding of
> possible compression options. E.g. Oracle has a lot of compression
> techniques, including heat maps, background compression, per-block
> compression, data dictionaries, etc. [4].
>
> [1] https://issues.apache.org/jira/browse/IGNITE-5097
> [2] https://issues.apache.org/jira/browse/IGNITE-5655
> [3] https://issues.apache.org/jira/browse/IGNITE-3939
> [4] http://www.oracle.com/technetwork/database/options/compression/advanced-compression-wp-12c-1896128.pdf
>
> Vladimir.
>
>
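
To make the shared "dictionary" idea from the quoted message concrete, here is a minimal sketch in plain Java. It is not Ignite code: the class name DictionarySketch and its methods are hypothetical, and it only illustrates storing each distinct value once and keeping N integer references to it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical illustration of a cache-level value dictionary; not part of Ignite. */
final class DictionarySketch {
    /** Maps each distinct value to its reference (dictionary index). */
    private final Map<String, Integer> idByValue = new HashMap<>();

    /** Stores each distinct value exactly once, indexed by reference. */
    private final List<String> valueById = new ArrayList<>();

    /** Returns the reference for a value, storing the value only the first time it is seen. */
    int encode(String value) {
        Integer id = idByValue.get(value);
        if (id == null) {
            id = valueById.size();
            valueById.add(value);
            idByValue.put(value, id);
        }
        return id;
    }

    /** Resolves a reference back to the stored value. */
    String decode(int id) {
        return valueById.get(id);
    }

    /** Number of values physically stored in the dictionary. */
    int distinctValues() {
        return valueById.size();
    }

    public static void main(String[] args) {
        DictionarySketch dict = new DictionarySketch();
        String[] column = {"Moscow", "London", "Moscow", "Moscow", "London"};

        // Each row keeps only a small integer reference instead of the full string.
        int[] refs = new int[column.length];
        for (int i = 0; i < column.length; i++)
            refs[i] = dict.encode(column[i]);

        System.out.println(Arrays.toString(refs) + " distinct=" + dict.distinctValues());
    }
}
```

Five rows end up as five references plus two stored strings. The gain depends on how often values repeat across the whole cache, which is why this kind of compression cannot be done inside a single BinaryObject.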

--
Alexey Kuznetsov

Re: Data compression in Ignite 2.0

daradurvs
Vladimir, thank you for the detailed explanation.

I think I've understood the main idea of the described storage compression.

I'll join the new discussion after studying the given material and
completing the varint optimization [1].

[1] https://issues.apache.org/jira/browse/IGNITE-5097
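
For readers unfamiliar with varint encoding, below is a minimal sketch of the general technique in Java. It assumes the common base-128 scheme (7 payload bits per byte, high bit set when more bytes follow) applied to an int treated as unsigned; the class VarintSketch is hypothetical and is not the implementation from IGNITE-5097.

```java
import java.io.ByteArrayOutputStream;

/** Hypothetical base-128 varint sketch; not the IGNITE-5097 implementation. */
final class VarintSketch {
    /** Encodes an int (treated as unsigned 32-bit) into 1 to 5 bytes. */
    static byte[] encode(int value) {
        ByteArrayOutputStream out = new ByteArrayOutputStream(5);

        // While more than 7 significant bits remain, emit 7 bits with the continuation flag.
        while ((value & ~0x7F) != 0) {
            out.write((value & 0x7F) | 0x80);
            value >>>= 7;
        }

        // Last byte: high bit clear marks the end of the number.
        out.write(value);

        return out.toByteArray();
    }

    /** Decodes a value produced by encode(). */
    static int decode(byte[] bytes) {
        int result = 0;
        int shift = 0;

        for (byte b : bytes) {
            result |= (b & 0x7F) << shift;

            if ((b & 0x80) == 0)
                return result;

            shift += 7;
        }

        throw new IllegalArgumentException("Truncated varint");
    }

    public static void main(String[] args) {
        for (int v : new int[] {1, 127, 128, 300, Integer.MAX_VALUE})
            System.out.println(v + " -> " + encode(v).length + " byte(s)");
    }
}
```

Small values, which are typical for lengths and offsets, take one or two bytes instead of a fixed four. The trade-off is that large values need five bytes.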




--
Best Regards, Vyacheslav D.

Re: Data compression in Ignite 2.0

daradurvs
Hi, should I close the initial ticket [1] as "Won't Fix" and add a link to
the new discussion about storage compression [2] in the comments?

[1] https://issues.apache.org/jira/browse/IGNITE-3592
[2] http://apache-ignite-developers.2346864.n4.nabble.com/Data-compression-in-Ignite-td20679.html

2017-08-09 23:05 GMT+03:00 Vyacheslav Daradur <[hidden email]>:

> Vladimir, thank you for the detailed explanation.
>
> I think I've understood the main idea of the described storage compression.
>
> I'll join the new discussion after researching the given material and
> completing the varint optimization [1].
>
> [1] https://issues.apache.org/jira/browse/IGNITE-5097
>
> 2017-08-02 15:43 GMT+03:00 Alexey Kuznetsov <[hidden email]>:
>
>> Vova,
>>
>> Finally we are back to my initial idea: to look at how "big databases compress"
>> data :)
>>
>>
>> Just a reminder of how IBM DB2 does this [1].
>>
>> [1] http://www.ibm.com/developerworks/data/library/techarticle/dm-1205db210compression/
>>
>> On Tue, Aug 1, 2017 at 4:15 PM, Vladimir Ozerov <[hidden email]>
>> wrote:
>>
>> > Vyacheslav,
>> >
>> > This is not about my needs, but about the product :-) BinaryObject is a
>> > central entity used for both data transfer and data storage. This is both
>> > good and bad at the same time.
>> >
>> > The good thing is that as we optimize the binary protocol, we improve both
>> > network and storage performance at the same time. We have at least 3 things
>> > which will be included into the product soon: varint encoding [1], optimized
>> > string encoding [2] and null-field optimization [3]. The bad thing is that
>> > the binary object format is not well suited for data storage optimizations,
>> > including compression. For example, one good compression technique is to
>> > organize data in column-store format, or to introduce a shared "dictionary"
>> > with unique values on cache level. In both cases N equal values are not
>> > stored N times. Instead, we store one value and N references to it, or so.
>> > This way 2x-10x compression is possible depending on workload type. The
>> > binary object protocol with some compression on top of it cannot give such
>> > an improvement, because it will compress data in individual objects, instead
>> > of compressing the whole cache data in a single context.
>> >
>> > That said, I propose to give up adding compression to BinaryObject. This is
>> > a dead end. Instead, we should:
>> > 1) Optimize protocol itself to be more compact, as described in
>> > aforementioned Ignite tickets
>> > 2) Start new discussion about storage compression
>> >
>> > You can read papers of other vendors to get better understanding on
>> > possible compression options. E.g. Oracle has a lot of compression
>> > techniques, including heat maps, background compression, per-block
>> > compression, data dictionaries, etc. [4].
>> >
>> > [1] https://issues.apache.org/jira/browse/IGNITE-5097
>> > [2] https://issues.apache.org/jira/browse/IGNITE-5655
>> > [3] https://issues.apache.org/jira/browse/IGNITE-3939
>> > [4] http://www.oracle.com/technetwork/database/options/compression/advanced-compression-wp-12c-1896128.pdf
>> >
>> > Vladimir.
>> >
>> >
>>
>> --
>> Alexey Kuznetsov
>>
>
>
>
> --
> Best Regards, Vyacheslav D.
>



--
Best Regards, Vyacheslav D.
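
For readers unfamiliar with the varint encoding mentioned in the quoted text, here is a minimal sketch of the general technique. It assumes the common base-128 scheme (7 payload bits per byte, high bit set when more bytes follow), as used for example in Protocol Buffers. Whether IGNITE-5097 uses exactly this layout is not stated in the thread, and the class VarInt below is a hypothetical name, not an Ignite API.

```java
import java.io.ByteArrayOutputStream;

/** Hypothetical base-128 varint codec for 32-bit ints; not part of Apache Ignite. */
final class VarInt {
    private VarInt() {
        // Static helpers only.
    }

    /** Encodes the value in 1 to 5 bytes, least significant 7-bit group first. */
    static byte[] encode(int value) {
        ByteArrayOutputStream out = new ByteArrayOutputStream(5);

        // While more than 7 significant bits remain, emit a group with the continuation bit set.
        while ((value & ~0x7F) != 0) {
            out.write((value & 0x7F) | 0x80);
            value >>>= 7;
        }

        // Last group: continuation bit is clear.
        out.write(value);

        return out.toByteArray();
    }

    /** Decodes a value produced by encode(). */
    static int decode(byte[] bytes) {
        int result = 0;
        int shift = 0;

        for (byte b : bytes) {
            result |= (b & 0x7F) << shift;

            if ((b & 0x80) == 0)
                break;

            shift += 7;
        }

        return result;
    }
}
```

Small non-negative values take fewer bytes than a fixed 4-byte int, while negative values take the full 5 bytes in this unsigned variant. That trade-off is one reason such an encoding helps both the wire format and storage only when typical field values are small.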

Re: Data compression in Ignite 2.0

Vladimir Ozerov
Hi Vyacheslav,

Yes, I would suggest that you do so.

On Fri, Aug 25, 2017 at 2:51 PM, Vyacheslav Daradur <[hidden email]>
wrote:

> Hi, should I close the initial ticket [1] as "Won't Fix" and add a link to
> the new discussion about storage compression [2] in the comments?
>
> [1] https://issues.apache.org/jira/browse/IGNITE-3592
> [2] http://apache-ignite-developers.2346864.n4.nabble.com/Data-compression-in-Ignite-td20679.html