Igniters,
I had several private talks with Igniters about data compression and would like to share the summary with ... Igniters :-)

Currently all of Ignite's data is uncompressed. This leads to excessive network traffic, GC pressure and disk IO (in the case of persistence). Most modern databases are able to compress data, which gives them a 2-4x size reduction on typical workloads. We need compression in Ignite.

There are several options I'd like to discuss. The main difference between them is the "level" at which to compress: per-entry, per-data-page or per-cache.

*1) Per-entry compression*
Apache Geode uses this approach. Every cache entry is compressed using Snappy. This is very easy to implement, but every entry access (e.g. reading a single field) requires full decompression or even re-compression, which could lead to higher CPU consumption and worse performance.

*2) Per-data-page compression*
Oracle and DB2 use this approach. Pages are compressed with a dictionary-based approach (e.g. LZV). Importantly, they do not compress the whole page. Instead, only the actual data is compressed, while the page structure remains intact. The dictionary is placed within the page. This way it is possible to work with individual entries and even individual fields without full page decompression. Another important thing: it is not necessary to re-compress the page on each write. Instead, data is stored in uncompressed form first and compressed only after a certain threshold is reached, so the negative CPU impact is minimal. The typical compression rate would be higher than in the per-entry case, because the more data you have, the better it can be compressed. (A small sketch illustrating this idea follows this message.)

*3) Per-cache compression*
Suggested by Alex Goncharuk. We could have a dictionary for the whole cache. This way we could achieve the highest compression rate possible. The downside is a complex implementation - we would have to develop an algorithm for sharing the dictionary within the cluster. At some point the dictionary could become too large to fit in memory, so we would either have to control its size or spill it to disk.

I propose to use the per-data-page approach, as it both gives a nice compression rate and is relatively easy to implement.

Please share your thoughts.

Vladimir.
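For illustration, here is a minimal Java sketch of the per-data-page idea under simplified assumptions: each entry payload is deflated against a small dictionary that conceptually lives inside the same page, so a single entry can be decompressed without touching its neighbours and without disturbing the page structure. The PageDictionaryCodec class and the sample dictionary content are hypothetical; real LZV-style dictionary construction and the "compress only after a threshold" logic are not shown.

import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

/**
 * Illustrative per-page codec: compresses individual entry payloads against
 * a dictionary stored in the same page, leaving the page structure intact.
 */
public class PageDictionaryCodec {
    /** Dictionary of byte sequences frequent within this page (hypothetical content). */
    private final byte[] pageDict;

    public PageDictionaryCodec(byte[] pageDict) {
        this.pageDict = pageDict;
    }

    /** Compresses a single entry payload; other entries on the page are untouched. */
    public byte[] compressEntry(byte[] entry) {
        Deflater def = new Deflater(Deflater.BEST_SPEED);
        def.setDictionary(pageDict);
        def.setInput(entry);
        def.finish();

        byte[] buf = new byte[entry.length + 64]; // small slack for incompressible data.
        int len = def.deflate(buf);
        def.end();

        byte[] out = new byte[len];
        System.arraycopy(buf, 0, out, 0, len);
        return out;
    }

    /** Decompresses a single entry payload using the same in-page dictionary. */
    public byte[] decompressEntry(byte[] compressed, int originalLen) throws DataFormatException {
        Inflater inf = new Inflater();
        inf.setInput(compressed);

        byte[] out = new byte[originalLen];
        int n = inf.inflate(out);
        if (n == 0 && inf.needsDictionary()) { // zlib asks for the preset dictionary.
            inf.setDictionary(pageDict);
            inf.inflate(out);
        }
        inf.end();
        return out;
    }

    public static void main(String[] args) throws DataFormatException {
        byte[] dict = "org.apache.ignite.PersonAccountAddress".getBytes();
        PageDictionaryCodec codec = new PageDictionaryCodec(dict);

        byte[] entry = "Person{org.apache.ignite.Address=Address}".getBytes();
        byte[] packed = codec.compressEntry(entry);
        byte[] restored = codec.decompressEntry(packed, entry.length);

        System.out.println(new String(restored) + ", " + entry.length + " -> " + packed.length + " bytes");
    }
}

The point of the preset dictionary is that even small payloads compress reasonably well, because repeated class names and field values are factored out once per page rather than once per entry.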
Agree, I like 2 best of all.
--Yakov
Vova,
I do not understand how (2) will help us reduce the size of the serialized object. It sounds like we need to serialize an object the way we do today, and then compress it once it arrives on the server side and gets stored in some page. This looks like a double whammy.

I do not understand why approach (3) is difficult. Yes, we will have to implement a shared compression table, but I am not sure it will ever grow too big. As a matter of fact, we can stop adding entries to it after it reaches a certain pre-defined size. Also, we already know how to persist data, so spilling the compression table to disk should not be an issue.

If my assumption about (2) is correct, I would like to ask you to give another thought to (3).

D.
+1 for per-page approach (2).
I'm not sure that dictionary-based encoding within a page is really efficient. Compression quality grows with the size of the sample (the page size in our case - not so big), and we can't encode "words" that span the border of two entries. Also, we'll have to "garbage-collect" the page dictionary from time to time (as the set of entries changes, some "words" may become redundant). However, if Oracle implemented it in a profitable way, we can do it as well :)

I have a concern about (3): it will add significant latency/network overhead. Accessing a distributed cache with the dictionary on every put is not free.

Also, I'd like to propose one more approach (let it be (4)) - a local per-partition dictionary. We can store the set of dictionary "words" in a B+ tree or any other structure in pages - it will be persisted automatically. The efficiency of encoding will be higher than in the per-page case. Thoughts?

The open question for a per-partition dictionary is how to "garbage-collect" it. In the per-page case we can do it under the page write lock, but here we have to do it in a more tricky way.

Best Regards,
Ivan Rakov
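As a rough sketch of the bookkeeping such a per-partition dictionary might need, the toy class below reference-counts each dictionary "word" so that words no longer used by any entry can be garbage-collected. All names here are made up for illustration; persisting the word set in a B+ tree, as proposed above, is not shown.

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/**
 * Toy per-partition dictionary: maps frequent sequences (strings here, for brevity)
 * to short codes and tracks how many stored entries use each word, so that unused
 * words can be "garbage-collected".
 */
public class PartitionDictionary {
    private static final class Word {
        final int code;
        long refCnt;
        Word(int code) { this.code = code; }
    }

    private final Map<String, Word> words = new ConcurrentHashMap<>();
    private int nextCode;

    /** Registers (or re-uses) a word when an entry referencing it is written. */
    public synchronized int acquire(String word) {
        Word w = words.computeIfAbsent(word, k -> new Word(nextCode++));
        w.refCnt++;
        return w.code;
    }

    /** Releases a word when an entry referencing it is removed or rewritten. */
    public synchronized void release(String word) {
        Word w = words.get(word);
        if (w != null && --w.refCnt == 0)
            words.remove(word); // "garbage-collect" the unused word.
    }

    public synchronized int size() {
        return words.size();
    }

    public static void main(String[] args) {
        PartitionDictionary dict = new PartitionDictionary();
        int code = dict.acquire("org.apache.ignite.Person");
        dict.acquire("org.apache.ignite.Person");
        dict.release("org.apache.ignite.Person");
        System.out.println("code=" + code + ", words=" + dict.size()); // still referenced once.
        dict.release("org.apache.ignite.Person");
        System.out.println("words=" + dict.size());                    // now collected.
    }
}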
Ivan, your points definitely make sense; however, I see at least one issue compared to the per-page approach. With per-page we can split the page into compressed and uncompressed regions and change the dictionary if more efficient compression is possible. With bigger areas such as a partition or the entire cache, dynamic dictionary changes may be very complex.

--Yakov
I still don't understand per-partition compression, which is only local.
How do entries get updated in the middle of a partition file?

D.
Igniters,
This is to inform you that we had another conversation about compression design, where Ivan's proposal about per-partition compression was discussed. Let me share our current vision.

1) The per-partition approach will have a higher compression rate than the per-page approach. This is beyond doubt.
2) But it is much more difficult to implement compared to per-page, in particular the dictionary management. Moreover, write operations might suffer.

As a result, we think that the per-page approach is the way to go as a first iteration. In the future we may consider per-partition compression as a kind of "one-time" batch operation which could be performed on rarely updated and historical data - a tiered approach. Mature commercial vendors work in exactly the same way.

Sergey Puchnin and I are trying to better understand how exactly compression could be implemented.

What we understand now:
1) This will be dictionary-based compression (e.g. LZV).
2) A page will be compressed in batch mode, i.e. not on every change, but when a certain threshold is reached (e.g. the page's free space drops below 20%).

What we do not understand yet:
1) Granularity of the compression algorithm.
1.1) It could be per-entry - i.e. we compress the whole entry content, but respect boundaries between entries. E.g.: before - [ENTRY_1][ENTRY_2], after - [COMPRESSED_ENTRY_1][COMPRESSED_ENTRY_2] (as opposed to [COMPRESSED ENTRY_1 and ENTRY_2]).
1.2) Or it could be per-field - i.e. we compress fields, but respect the binary object layout.

The first approach is simple and straightforward and will give an acceptable compression rate, but we will have to decompress (or even re-compress) the whole binary object on every field access, which may ruin our SQL performance. The second approach is more complex, and we are not sure about its compression rate, but as the BinaryObject structure is preserved, we will still have fast constant-time per-field access. (A sketch contrasting the two layouts follows this message.)

Please share your thoughts.

Vladimir.
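To make the two granularities concrete, the sketch below models a binary object as an ordered map of field name to serialized value and contrasts option 1.1 (compress the whole entry as one blob) with option 1.2 (compress each field value separately while keeping the layout intact). The class, the field names and the identity "codec" are illustrative assumptions only, not Ignite's real BinaryObject format.

import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

/**
 * Structural contrast of the two granularities, on a toy "binary object"
 * modelled as an ordered map of field name -> serialized field value.
 */
public class GranularityExample {
    /** Placeholder for a real dictionary-based codec (e.g. LZV-style); identity here on purpose. */
    static byte[] codec(byte[] data) {
        return data.clone();
    }

    /** 1.1 Per-entry: all fields are concatenated and encoded as a single blob.
     *  Reading one field later requires decoding the whole blob first. */
    static byte[] packPerEntry(Map<String, byte[]> fields) {
        StringBuilder whole = new StringBuilder();
        for (Map.Entry<String, byte[]> f : fields.entrySet())
            whole.append(f.getKey()).append('=').append(new String(f.getValue(), StandardCharsets.UTF_8)).append(';');
        return codec(whole.toString().getBytes(StandardCharsets.UTF_8));
    }

    /** 1.2 Per-field: each value is encoded separately and the layout (field map) stays intact,
     *  so a single field can be decoded without touching the others. */
    static Map<String, byte[]> packPerField(Map<String, byte[]> fields) {
        Map<String, byte[]> res = new LinkedHashMap<>();
        for (Map.Entry<String, byte[]> f : fields.entrySet())
            res.put(f.getKey(), codec(f.getValue()));
        return res;
    }

    public static void main(String[] args) {
        Map<String, byte[]> person = new LinkedHashMap<>();
        person.put("name", "Vladimir".getBytes(StandardCharsets.UTF_8));
        person.put("city", "Saint Petersburg".getBytes(StandardCharsets.UTF_8));

        byte[] blob = packPerEntry(person);                  // one opaque blob -> full decode per field access.
        Map<String, byte[]> perField = packPerField(person); // layout preserved -> direct access to "city".

        System.out.println("per-entry blob size: " + blob.length + " bytes");
        System.out.println("per-field 'city': " + new String(codec(perField.get("city")), StandardCharsets.UTF_8));
    }
}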
Some more comments:
1. I see advantages in both approaches - any chance we can support both of them?
2. We have to make sure that it is possible to enable different types of compression at the per-cache level.
3. It would also be nice to show the compression ratio, i.e. how much data space was saved, as part of monitoring.

D.
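Purely as a strawman for points 2 and 3, the sketch below imagines what a per-cache compression setting and a compression-ratio metric could look like. None of these types or methods exist in Ignite; they are hypothetical names intended only to make the requirements concrete.

/** Hypothetical API sketch only -- none of these types exist in Ignite today. */
public class CompressionConfigSketch {

    /** Per-cache choice of compression strategy (point 2 above). */
    enum CompressionMode {
        NONE,         // no compression for this cache
        PER_PAGE,     // in-page dictionary, batch compression on threshold
        PER_PARTITION // possible future "one-time" compression of cold data
    }

    /** Imaginary per-cache settings that a cache configuration could carry. */
    static class CacheCompressionConfig {
        CompressionMode mode = CompressionMode.PER_PAGE;
        double pageFreeSpaceThreshold = 0.2; // compress when free space drops below 20%
    }

    /** Imaginary metrics for point 3: show how much space compression saved. */
    interface CompressionMetrics {
        long uncompressedSize();
        long compressedSize();

        /** e.g. 3.0 means the data shrank three times. */
        default double compressionRatio() {
            long c = compressedSize();
            return c == 0 ? 1.0 : (double) uncompressedSize() / c;
        }
    }

    public static void main(String[] args) {
        CompressionMetrics m = new CompressionMetrics() {
            @Override public long uncompressedSize() { return 300; }
            @Override public long compressedSize()   { return 100; }
        };
        System.out.printf("ratio: %.1fx%n", m.compressionRatio());
    }
}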
Hi, Igniters!
Are there any results of the research, or a prototype of the compression feature?
Hi Igniters,
What do you think about implementing an analogue of the Java G1 collector feature 'String deduplication' (-XX:+UseG1GC -XX:+UseStringDeduplication)?

Most business applications have a large share of objects of type String. As a result, char[] arrays are often at the top of heap usage. To reduce the memory consumed by duplicates, the G1 collector identifies equal strings in the background and deduplicates them so that they share a single backing array (which is safe because String is immutable). Unfortunately we can't reuse the collector's feature, as Ignite stores data off-heap.

What if we consider implementing the same deduplication feature for Ignite Durable Memory?

Sincerely,
Dmitry Pavlov
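As a rough, on-heap illustration of the deduplication idea (an off-heap variant for Durable Memory would store page offsets instead of object references), the toy table below keeps one canonical copy per distinct string value. The DedupTable class is a made-up example, not a proposal for a concrete Ignite API.

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/**
 * Toy deduplication table in the spirit of G1's string deduplication:
 * identical values are collapsed to a single canonical instance.
 * (G1 itself deduplicates the backing char[]; an Ignite variant would have
 * to work on serialized values in off-heap pages instead.)
 */
public class DedupTable {
    private final Map<String, String> canonical = new ConcurrentHashMap<>();

    /** Returns the single shared instance for the given value. */
    public String dedup(String val) {
        String prev = canonical.putIfAbsent(val, val);
        return prev != null ? prev : val;
    }

    public int distinctValues() {
        return canonical.size();
    }

    public static void main(String[] args) {
        DedupTable t = new DedupTable();
        String a = t.dedup(new String("Moscow"));
        String b = t.dedup(new String("Moscow"));
        System.out.println("same instance: " + (a == b) + ", distinct: " + t.distinctValues());
    }
}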
Dmitry,
What we've discussed so far in this topic is essentially the same concept. We will deduplicate identical byte sequences at the page level.
Vladimir,
Focusing on strings would also allow us to deduplicate strings in objects during unmarshalling from a page into the heap.

Moreover, this could be a first, simple step towards implementing a more complex algorithm.

Sincerely,
Dmitriy Pavlov
This would require a shared dictionary, which is complex to maintain. We evaluated this option, but rejected it due to its complexity. Another important thing is that the String type does not dominate in user models, so I do not see why it should be treated as a special case.
Hi Vladimir,
In my experience, String is a frequently used data type in business applications and is, moreover, often indexed.

> String type doesn't dominate in user models

What is the basis of this assumption?

Could you explain why String compression is more complex than byte[] compression? It seems they both require dictionaries.

Sincerely,
Dmitriy Pavlov
Dmitry,
Ignite is used by a variety of applications. Some models I have seen were made entirely of strings, others of longs, decimals, etc. It is impossible to either prove or disprove which data type is dominant. My position is based on experience with Ignite users and on the approaches used in other databases.

Strings are more complex because your approach assumes that there is a common dictionary of strings, with references to these strings from data pages. As soon as you have cross-page references, you are in trouble, because you need to maintain that dictionary. With the page-based approach we agreed on previously, the dictionary is generic (i.e. it can compress not only strings, but any byte sequence) and is located inside the page, meaning that all you need to maintain this dictionary is a page lock.