cache size() calculation for MVCC

classic Classic list List threaded Threaded
8 messages Options
Reply | Threaded
Open this post in threaded view
|

cache size() calculation for MVCC

Sergey Kalashnikov
Hi Igniters,

I need your advice on a task at hand.

Currently cache API size() is a constant time operation, since the
number of entries is maintained as a separate counter.
However, for MVCC-enabled cache there can be multiple versions of the
same entry.
In order to calculate the size we need to obtain a MVCC snapshot and
then iterate over data pages filtering invisible versions.
So, it is impossible to keep the same complexity guarantees.

My current implementation internally switches to "full-scan" approach
if cache in question is a MVCC-enabled cache.
It happens unbeknown to users, which may expect lightning-fast
response as before.
Perhaps, we might add a new constant to CachePeekMode enumeration that
is passed to cache size() to make it explicit?

The second concern is that cache size calculation is also included
into Cache Metrics API and Visor functionality.
Will it be OK for metrics and things alike to keep returning raw
unfiltered number of entries?
Is there any sense in showing raw unfiltered number of entries which
may vary greatly from invokation to invokation with just simple
updates running in background?

Please share your thoughts.

Thanks in advance.
--
Sergey
Reply | Threaded
Open this post in threaded view
|

Re: cache size() calculation for MVCC

Vladimir Ozerov
This is interesting question. Full-scan size may be tremendously slow
operation on large data sets. On the other hand, printing total number of
tuples including old and aborted versions make little to no sense as well.
Looks like we need to choose lesser of two evils. What if we do the
following:
1) Left default behavior as is - O(1) complexity, but includes invalid
versions
2) As Sergey suggested, add new peek mode "MVCC_ALIVE_ONLY" which will
perform full scan.

Alternatively we may throw an "UnsupportedOperationException" from this
method - why not?

Thoughts?

On Tue, Apr 24, 2018 at 4:28 PM, Sergey Kalashnikov <[hidden email]>
wrote:

> Hi Igniters,
>
> I need your advice on a task at hand.
>
> Currently cache API size() is a constant time operation, since the
> number of entries is maintained as a separate counter.
> However, for MVCC-enabled cache there can be multiple versions of the
> same entry.
> In order to calculate the size we need to obtain a MVCC snapshot and
> then iterate over data pages filtering invisible versions.
> So, it is impossible to keep the same complexity guarantees.
>
> My current implementation internally switches to "full-scan" approach
> if cache in question is a MVCC-enabled cache.
> It happens unbeknown to users, which may expect lightning-fast
> response as before.
> Perhaps, we might add a new constant to CachePeekMode enumeration that
> is passed to cache size() to make it explicit?
>
> The second concern is that cache size calculation is also included
> into Cache Metrics API and Visor functionality.
> Will it be OK for metrics and things alike to keep returning raw
> unfiltered number of entries?
> Is there any sense in showing raw unfiltered number of entries which
> may vary greatly from invokation to invokation with just simple
> updates running in background?
>
> Please share your thoughts.
>
> Thanks in advance.
> --
> Sergey
>
Reply | Threaded
Open this post in threaded view
|

Re: cache size() calculation for MVCC

yzhdanov
Guys,

How do we update counter right now?

If we move to fair thread-per-partition we can update counter only if we
add new key and skip if we add or remove a version. Thoughts?

--Yakov

2018-04-25 12:07 GMT+03:00 Vladimir Ozerov <[hidden email]>:

> This is interesting question. Full-scan size may be tremendously slow
> operation on large data sets. On the other hand, printing total number of
> tuples including old and aborted versions make little to no sense as well.
> Looks like we need to choose lesser of two evils. What if we do the
> following:
> 1) Left default behavior as is - O(1) complexity, but includes invalid
> versions
> 2) As Sergey suggested, add new peek mode "MVCC_ALIVE_ONLY" which will
> perform full scan.
>
> Alternatively we may throw an "UnsupportedOperationException" from this
> method - why not?
>
> Thoughts?
>
> On Tue, Apr 24, 2018 at 4:28 PM, Sergey Kalashnikov <[hidden email]
> >
> wrote:
>
> > Hi Igniters,
> >
> > I need your advice on a task at hand.
> >
> > Currently cache API size() is a constant time operation, since the
> > number of entries is maintained as a separate counter.
> > However, for MVCC-enabled cache there can be multiple versions of the
> > same entry.
> > In order to calculate the size we need to obtain a MVCC snapshot and
> > then iterate over data pages filtering invisible versions.
> > So, it is impossible to keep the same complexity guarantees.
> >
> > My current implementation internally switches to "full-scan" approach
> > if cache in question is a MVCC-enabled cache.
> > It happens unbeknown to users, which may expect lightning-fast
> > response as before.
> > Perhaps, we might add a new constant to CachePeekMode enumeration that
> > is passed to cache size() to make it explicit?
> >
> > The second concern is that cache size calculation is also included
> > into Cache Metrics API and Visor functionality.
> > Will it be OK for metrics and things alike to keep returning raw
> > unfiltered number of entries?
> > Is there any sense in showing raw unfiltered number of entries which
> > may vary greatly from invokation to invokation with just simple
> > updates running in background?
> >
> > Please share your thoughts.
> >
> > Thanks in advance.
> > --
> > Sergey
> >
>
Reply | Threaded
Open this post in threaded view
|

Re: cache size() calculation for MVCC

Vladimir Ozerov
Yakov,

Thread-per-partition is hardly applicable for general SQL use case as user
operates on arbitrary data sets. But in general we may track size deltas
for partitions on transaction level. If transaction span one or several
partitions, we may hold this data in a single long or Map. If transaction
spans a lot of partitions, we may store this data in array.

What do you think?

On Wed, Apr 25, 2018 at 12:18 PM, Yakov Zhdanov <[hidden email]> wrote:

> Guys,
>
> How do we update counter right now?
>
> If we move to fair thread-per-partition we can update counter only if we
> add new key and skip if we add or remove a version. Thoughts?
>
> --Yakov
>
> 2018-04-25 12:07 GMT+03:00 Vladimir Ozerov <[hidden email]>:
>
> > This is interesting question. Full-scan size may be tremendously slow
> > operation on large data sets. On the other hand, printing total number of
> > tuples including old and aborted versions make little to no sense as
> well.
> > Looks like we need to choose lesser of two evils. What if we do the
> > following:
> > 1) Left default behavior as is - O(1) complexity, but includes invalid
> > versions
> > 2) As Sergey suggested, add new peek mode "MVCC_ALIVE_ONLY" which will
> > perform full scan.
> >
> > Alternatively we may throw an "UnsupportedOperationException" from this
> > method - why not?
> >
> > Thoughts?
> >
> > On Tue, Apr 24, 2018 at 4:28 PM, Sergey Kalashnikov <
> [hidden email]
> > >
> > wrote:
> >
> > > Hi Igniters,
> > >
> > > I need your advice on a task at hand.
> > >
> > > Currently cache API size() is a constant time operation, since the
> > > number of entries is maintained as a separate counter.
> > > However, for MVCC-enabled cache there can be multiple versions of the
> > > same entry.
> > > In order to calculate the size we need to obtain a MVCC snapshot and
> > > then iterate over data pages filtering invisible versions.
> > > So, it is impossible to keep the same complexity guarantees.
> > >
> > > My current implementation internally switches to "full-scan" approach
> > > if cache in question is a MVCC-enabled cache.
> > > It happens unbeknown to users, which may expect lightning-fast
> > > response as before.
> > > Perhaps, we might add a new constant to CachePeekMode enumeration that
> > > is passed to cache size() to make it explicit?
> > >
> > > The second concern is that cache size calculation is also included
> > > into Cache Metrics API and Visor functionality.
> > > Will it be OK for metrics and things alike to keep returning raw
> > > unfiltered number of entries?
> > > Is there any sense in showing raw unfiltered number of entries which
> > > may vary greatly from invokation to invokation with just simple
> > > updates running in background?
> > >
> > > Please share your thoughts.
> > >
> > > Thanks in advance.
> > > --
> > > Sergey
> > >
> >
>
Reply | Threaded
Open this post in threaded view
|

Re: cache size() calculation for MVCC

Sergey Kalashnikov
Guys,

The cache size counter is actually a set of per-partition counters.

2018-04-25 12:45 GMT+03:00 Vladimir Ozerov <[hidden email]>:

> Yakov,
>
> Thread-per-partition is hardly applicable for general SQL use case as user
> operates on arbitrary data sets. But in general we may track size deltas for
> partitions on transaction level. If transaction span one or several
> partitions, we may hold this data in a single long or Map. If transaction
> spans a lot of partitions, we may store this data in array.
>
> What do you think?
>
> On Wed, Apr 25, 2018 at 12:18 PM, Yakov Zhdanov <[hidden email]> wrote:
>>
>> Guys,
>>
>> How do we update counter right now?
>>
>> If we move to fair thread-per-partition we can update counter only if we
>> add new key and skip if we add or remove a version. Thoughts?
>>
>> --Yakov
>>
>> 2018-04-25 12:07 GMT+03:00 Vladimir Ozerov <[hidden email]>:
>>
>> > This is interesting question. Full-scan size may be tremendously slow
>> > operation on large data sets. On the other hand, printing total number
>> > of
>> > tuples including old and aborted versions make little to no sense as
>> > well.
>> > Looks like we need to choose lesser of two evils. What if we do the
>> > following:
>> > 1) Left default behavior as is - O(1) complexity, but includes invalid
>> > versions
>> > 2) As Sergey suggested, add new peek mode "MVCC_ALIVE_ONLY" which will
>> > perform full scan.
>> >
>> > Alternatively we may throw an "UnsupportedOperationException" from this
>> > method - why not?
>> >
>> > Thoughts?
>> >
>> > On Tue, Apr 24, 2018 at 4:28 PM, Sergey Kalashnikov
>> > <[hidden email]
>> > >
>> > wrote:
>> >
>> > > Hi Igniters,
>> > >
>> > > I need your advice on a task at hand.
>> > >
>> > > Currently cache API size() is a constant time operation, since the
>> > > number of entries is maintained as a separate counter.
>> > > However, for MVCC-enabled cache there can be multiple versions of the
>> > > same entry.
>> > > In order to calculate the size we need to obtain a MVCC snapshot and
>> > > then iterate over data pages filtering invisible versions.
>> > > So, it is impossible to keep the same complexity guarantees.
>> > >
>> > > My current implementation internally switches to "full-scan" approach
>> > > if cache in question is a MVCC-enabled cache.
>> > > It happens unbeknown to users, which may expect lightning-fast
>> > > response as before.
>> > > Perhaps, we might add a new constant to CachePeekMode enumeration that
>> > > is passed to cache size() to make it explicit?
>> > >
>> > > The second concern is that cache size calculation is also included
>> > > into Cache Metrics API and Visor functionality.
>> > > Will it be OK for metrics and things alike to keep returning raw
>> > > unfiltered number of entries?
>> > > Is there any sense in showing raw unfiltered number of entries which
>> > > may vary greatly from invokation to invokation with just simple
>> > > updates running in background?
>> > >
>> > > Please share your thoughts.
>> > >
>> > > Thanks in advance.
>> > > --
>> > > Sergey
>> > >
>> >
>
>
Reply | Threaded
Open this post in threaded view
|

Re: cache size() calculation for MVCC

Dmitriy Setrakyan-2
In reply to this post by Vladimir Ozerov
Vova, what about maintaining the size as you go? This way you don't have to
count multiple versions, no?

D.

On Wed, Apr 25, 2018, 5:07 PM Vladimir Ozerov <[hidden email]> wrote:

> This is interesting question. Full-scan size may be tremendously slow
> operation on large data sets. On the other hand, printing total number of
> tuples including old and aborted versions make little to no sense as well.
> Looks like we need to choose lesser of two evils. What if we do the
> following:
> 1) Left default behavior as is - O(1) complexity, but includes invalid
> versions
> 2) As Sergey suggested, add new peek mode "MVCC_ALIVE_ONLY" which will
> perform full scan.
>
> Alternatively we may throw an "UnsupportedOperationException" from this
> method - why not?
>
> Thoughts?
>
> On Tue, Apr 24, 2018 at 4:28 PM, Sergey Kalashnikov <[hidden email]
> >
> wrote:
>
> > Hi Igniters,
> >
> > I need your advice on a task at hand.
> >
> > Currently cache API size() is a constant time operation, since the
> > number of entries is maintained as a separate counter.
> > However, for MVCC-enabled cache there can be multiple versions of the
> > same entry.
> > In order to calculate the size we need to obtain a MVCC snapshot and
> > then iterate over data pages filtering invisible versions.
> > So, it is impossible to keep the same complexity guarantees.
> >
> > My current implementation internally switches to "full-scan" approach
> > if cache in question is a MVCC-enabled cache.
> > It happens unbeknown to users, which may expect lightning-fast
> > response as before.
> > Perhaps, we might add a new constant to CachePeekMode enumeration that
> > is passed to cache size() to make it explicit?
> >
> > The second concern is that cache size calculation is also included
> > into Cache Metrics API and Visor functionality.
> > Will it be OK for metrics and things alike to keep returning raw
> > unfiltered number of entries?
> > Is there any sense in showing raw unfiltered number of entries which
> > may vary greatly from invokation to invokation with just simple
> > updates running in background?
> >
> > Please share your thoughts.
> >
> > Thanks in advance.
> > --
> > Sergey
> >
>
Reply | Threaded
Open this post in threaded view
|

Re: cache size() calculation for MVCC

dsetrakyan
In reply to this post by Vladimir Ozerov
On Wed, Apr 25, 2018 at 5:45 PM, Vladimir Ozerov <[hidden email]>
wrote:

> Yakov,
>
> Thread-per-partition is hardly applicable for general SQL use case as user
> operates on arbitrary data sets. But in general we may track size deltas
> for partitions on transaction level. If transaction span one or several
> partitions, we may hold this data in a single long or Map. If transaction
> spans a lot of partitions, we may store this data in array.
>
> What do you think?
>

Vova, I think you are suggesting maintaining the size as you go. This does
sound like a right approach, assuming that it does not carry noticeable
overhead.

D.
Reply | Threaded
Open this post in threaded view
|

Re: cache size() calculation for MVCC

Vladimir Ozerov
Yes.

On Wed, Apr 25, 2018 at 10:28 PM, Dmitriy Setrakyan <[hidden email]>
wrote:

> On Wed, Apr 25, 2018 at 5:45 PM, Vladimir Ozerov <[hidden email]>
> wrote:
>
> > Yakov,
> >
> > Thread-per-partition is hardly applicable for general SQL use case as
> user
> > operates on arbitrary data sets. But in general we may track size deltas
> > for partitions on transaction level. If transaction span one or several
> > partitions, we may hold this data in a single long or Map. If transaction
> > spans a lot of partitions, we may store this data in array.
> >
> > What do you think?
> >
>
> Vova, I think you are suggesting maintaining the size as you go. This does
> sound like a right approach, assuming that it does not carry noticeable
> overhead.
>
> D.
>