Hello everyone.

https://issues.apache.org/jira/browse/IGNITE-3592 is ready for review.
The CI tests [1] look good.
An Upsource review has been created [2].

The solution provides two ways to use compression.

1. Full compression mode.
It is switched on via the IgniteConfiguration#fullCompressionMode field.
Full compression means that all data (metadata + value) of the whole object is compressed at the end of marshalling.

2. Annotated fields compression.
Values of fields annotated with @BinaryCompression are compressed at marshalling.
Only the values are compressed; the header of the whole object is not.

Users can plug in their own compressor.
It is configured via the IgniteConfiguration#compressor field.
There is a Compressor interface which the user can implement.

The default implementation uses GZIP.
There is one more implementation, which uses the ZLIB compression library (Deflater).

There is also a more flexible version [3] with parameterized annotations, where we have several compressors and can choose among them at runtime.
But such freedom can cause many problems.

Feel free to contact me for fixes and improvements.

[1] http://ci.ignite.apache.org/viewLog.html?buildId=520503
[2] http://reviews.ignite.apache.org/ignite/review/IGNT-CR-143
[3] https://github.com/apache/ignite/pull/1650/commits/c7cf273d46bfe52fa57f1e296f3e649012b18572

--
Best Regards, Vyacheslav
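For readers following the thread, here is a minimal sketch of what a pluggable compressor with a GZIP and a ZLIB (Deflater) implementation could look like. The message above only names a Compressor interface; the method names compress/decompress, the class names, and the error handling below are assumptions for illustration, not the API from the pull request. Only java.util.zip from the JDK is used.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;
import java.util.zip.Inflater;

public class CompressorSketch {
    /** Hypothetical shape of a user-pluggable compressor; method names are assumptions. */
    public interface Compressor {
        byte[] compress(byte[] bytes);
        byte[] decompress(byte[] bytes);
    }

    /** GZIP-based variant built on the JDK's GZIP streams. */
    public static class GzipCompressor implements Compressor {
        @Override public byte[] compress(byte[] bytes) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            // Closing the GZIP stream writes the trailer, so read the buffer only afterwards.
            try (GZIPOutputStream gzip = new GZIPOutputStream(out)) {
                gzip.write(bytes);
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
            return out.toByteArray();
        }

        @Override public byte[] decompress(byte[] bytes) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            try (GZIPInputStream gzip = new GZIPInputStream(new ByteArrayInputStream(bytes))) {
                byte[] buf = new byte[4096];
                int n;
                while ((n = gzip.read(buf)) != -1)
                    out.write(buf, 0, n);
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
            return out.toByteArray();
        }
    }

    /** ZLIB-based variant built directly on Deflater/Inflater. */
    public static class DeflaterCompressor implements Compressor {
        @Override public byte[] compress(byte[] bytes) {
            Deflater deflater = new Deflater();
            deflater.setInput(bytes);
            deflater.finish();
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[4096];
            while (!deflater.finished())
                out.write(buf, 0, deflater.deflate(buf));
            deflater.end();
            return out.toByteArray();
        }

        @Override public byte[] decompress(byte[] bytes) {
            Inflater inflater = new Inflater();
            inflater.setInput(bytes);
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[4096];
            try {
                while (!inflater.finished()) {
                    int n = inflater.inflate(buf);
                    // No progress and more input wanted means the stream is incomplete.
                    if (n == 0 && (inflater.needsInput() || inflater.needsDictionary()))
                        throw new IllegalArgumentException("Truncated or unsupported ZLIB stream");
                    out.write(buf, 0, n);
                }
            } catch (DataFormatException e) {
                throw new IllegalArgumentException("Invalid ZLIB stream", e);
            } finally {
                inflater.end();
            }
            return out.toByteArray();
        }
    }

    public static void main(String[] args) {
        byte[] data = "value value value value value".getBytes(StandardCharsets.UTF_8);
        for (Compressor c : new Compressor[] {new GzipCompressor(), new DeflaterCompressor()}) {
            byte[] packed = c.compress(data);
            boolean ok = Arrays.equals(data, c.decompress(packed));
            System.out.println(c.getClass().getSimpleName() + ": round trip ok = " + ok);
        }
    }
}
```

Both variants use the same DEFLATE algorithm; GZIP adds a larger header and a CRC trailer, which is one reason the two may give slightly different sizes on small values.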
Hello Vyacheslav, thanks for your efforts.

Is there any special support for SQL? In the original discussion [1] around this task, folks expressed several concerns about the usefulness of compression without the ability to execute SQL queries over compressed data. In general, it makes sense to continue the discussion in [1].

[1] http://apache-ignite-developers.2346864.n4.nabble.com/Data-compression-in-Ignite-2-0-td10099.html

--
Denis
On Wed, Mar 29, 2017 at 10:56 AM, Denis Magda <[hidden email]> wrote:

> Is there any special support for SQL? In the original discussion [1]
> around this task the folks expressed several concerns about compression
> usefulness w/o the ability to execute SQL queries over compressed data.

I second Denis' concern. Compression is very useful, but we must be able to use SQL queries on the compressed data. One way would be to make sure that the indexes are not compressed, which would mean that we have to decompress the value to update indexes, or do the reverse: compress it only after we update the indexes.

Sergi, what do you think?
The solution is implemented at the core level and works with the binary marshaller.

If you mean cache queries, they work with compressed data.

--
Best Regards, Vyacheslav
On Wed, Mar 29, 2017 at 11:44 AM, Vyacheslav Daradur <[hidden email]> wrote:

> Solution implemented in core-level and works with binary-marshaller.
>
> If you about the cache queries - it works with compressed data.

Vyacheslav, can you please explain how the cache queries work with the compressed data?
Queries work with BinaryObjectImpl.

1. In full compression mode, the compressed byte sequence is decompressed at initialization of BinaryObjectImpl.

2. With annotated fields compression, the values of compressed fields are decompressed on demand during deserialization, for example when the methods BinaryObjectImpl#field and BinaryObjectImpl#fieldByOrder are called.

--
Best Regards, Vyacheslav
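The on-demand behaviour described for annotated fields can be illustrated with a small stand-alone sketch. This is not BinaryObjectImpl and does not use any Ignite classes; LazyFieldSketch, putPlain, putCompressed and the decompression counter are all hypothetical names introduced here. The point it shows is only that field names stay readable while a compressed value is unpacked when, and only when, that field is read.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

/** Hypothetical stand-in for an object whose annotated fields are stored compressed. */
public class LazyFieldSketch {
    private final Map<String, byte[]> plain = new HashMap<>();      // values stored as-is
    private final Map<String, byte[]> compressed = new HashMap<>(); // values stored GZIP-compressed
    private int decompressions;                                     // how many times a value was unpacked

    public void putPlain(String name, String value) {
        plain.put(name, value.getBytes(StandardCharsets.UTF_8));
    }

    public void putCompressed(String name, String value) {
        compressed.put(name, gzip(value.getBytes(StandardCharsets.UTF_8)));
    }

    /** Reads a field; a compressed value is decompressed only at this point. */
    public String field(String name) {
        byte[] raw = plain.get(name);
        if (raw == null) {
            byte[] packed = compressed.get(name);
            if (packed == null)
                return null;
            raw = gunzip(packed);
            decompressions++;
        }
        return new String(raw, StandardCharsets.UTF_8);
    }

    public int decompressions() {
        return decompressions;
    }

    private static byte[] gzip(byte[] bytes) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(out)) {
            gz.write(bytes);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    private static byte[] gunzip(byte[] bytes) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPInputStream gz = new GZIPInputStream(new ByteArrayInputStream(bytes))) {
            byte[] buf = new byte[4096];
            int n;
            while ((n = gz.read(buf)) != -1)
                out.write(buf, 0, n);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        LazyFieldSketch obj = new LazyFieldSketch();
        obj.putPlain("name", "Ignite");
        obj.putCompressed("description", "a long text value");
        System.out.println(obj.field("name") + ", decompressions so far: " + obj.decompressions());
        System.out.println(obj.field("description") + ", decompressions so far: " + obj.decompressions());
    }
}
```

The sketch does not cache the unpacked value, so every read of a compressed field pays the decompression cost again; whether the real implementation caches is not stated in the thread.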
On Wed, Mar 29, 2017 at 11:57 AM, Vyacheslav Daradur <[hidden email]> wrote:

> Queries works with BinaryObjectImpl.
>
> 1. In the full compression mode - compressed bytes sequence - will be
> decompressed at initialization of BinaryObjectImpl.

At which point does this step take place? Do we deserialize right when we receive the object over the wire?

> 2. With annotated fields compression - value of compressed fields will be
> decompress at deserializing on demand, for example when calls methods
> BinaryObjectImpl#field and BinaryObjectImpl#fieldByOrder

Forgive me if I don't know the internals, but does this happen when SQL queries are executed?
> At which point does this step take place? Do we deserialize right when we
> receive the object over the wire?

When the object is put into the cache, after marshalling.
This is covered by existing tests, properly configured [1][2][3].

> Forgive me if I don't know the internals, but does this happen when SQL
> queries are executed?

Yes. This is covered by tests [4].

[1] https://github.com/apache/ignite/pull/1650/files#diff-af9e2960a6c9f73fa56a5b3824b6b397
[2] https://github.com/apache/ignite/pull/1650/files#diff-ed2aa7d56ed004ae9bc975edc9b8a9c2
[3] https://github.com/apache/ignite/pull/1650/files#diff-a4b76c24a5f9bc9e78d7cff0a7645328
[4] https://github.com/apache/ignite/pull/1650/files#diff-c19a9df4058141d059bb577e75244764

--
Best Regards, Vyacheslav
Sergi, Vovan,

As the SQL and binary marshaller maintainers, could you plan to review the contribution?

--
Denis
Hello everyone.

I have prepared a raw evaluation of compression at the marshaller layer (GZIP implementation):
https://github.com/daradurvs/ignite-compression/tree/master/src/main/resources/result

Planned for the evaluation:
1. Deflater compressor implementation;
2. Snappy compressor implementation;
3. CPU load estimation with JMH;
4. Tests with different data.

--
Best Regards, Vyacheslav
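As an illustration of why item 4 (tests with different data) matters, the following sketch measures the compressed-to-original size ratio of the JDK Deflater on two synthetic inputs. It is not taken from the linked evaluation repository; the class name, the helper methods and the choice of inputs are assumptions made for this example only.

```java
import java.util.Random;
import java.util.zip.Deflater;

public class CompressionRatioSketch {
    /** Returns the size in bytes of the ZLIB (Deflater) output for the given input. */
    public static int deflatedSize(byte[] input) {
        Deflater deflater = new Deflater();
        deflater.setInput(input);
        deflater.finish();
        byte[] buf = new byte[4096];
        int total = 0;
        while (!deflater.finished())
            total += deflater.deflate(buf);
        deflater.end();
        return total;
    }

    /** Compressed size divided by original size; lower means better compression. */
    public static double ratio(byte[] input) {
        return (double) deflatedSize(input) / input.length;
    }

    /** Highly repetitive data: the pattern "abc" repeated. */
    public static byte[] repetitive(int size) {
        byte[] data = new byte[size];
        for (int i = 0; i < size; i++)
            data[i] = (byte) ('a' + i % 3);
        return data;
    }

    /** Pseudo-random data with a fixed seed, so runs are repeatable. */
    public static byte[] random(int size, long seed) {
        byte[] data = new byte[size];
        new Random(seed).nextBytes(data);
        return data;
    }

    public static void main(String[] args) {
        System.out.printf("repetitive: %.3f%n", ratio(repetitive(10_000)));
        System.out.printf("random:     %.3f%n", ratio(random(10_000, 42L)));
    }
}
```

Repetitive input shrinks to a small fraction of its size, while random input does not shrink at all, so results measured on one kind of data say little about another.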
Guys, any thoughts?

I'm ready to quickly improve the solution according to your notes so that it can be included in Ignite 2.0.

--
Best Regards, Vyacheslav