Apache Ignite Developers - Legacy Mail Archive

MVCC configuration

Semyon Boikov-2

MVCC configuration

Hi all,

I'm currently working on the MVCC feature (IGNITE-3478) and need your opinion
on the related configuration options.

1. MVCC will definitely bring some performance overhead, so I think it
should not be enabled by default. I'm going to add a special flag to the
cache configuration: CacheConfiguration.isMvccEnabled.

2. In the current mvcc architecture, some node in the cluster must assign
versions for tx updates and queries (the mvcc coordinator). The mvcc
coordinator is a crucial component and should perform as fast as possible.
It seems we need to introduce a special 'dedicated mvcc coordinator' node
role: it should not be possible to start a cache on such a node, and it
should not process users' compute jobs. At the same time, any regular
server node should be able to become the mvcc coordinator: this is useful
during development (no extra setup for mvcc is needed) and covers the
scenario where all dedicated coordinator nodes fail. So we need a way to
make a node a 'dedicated mvcc coordinator'; we can add a special flag to the
Ignite configuration: IgniteConfiguration.isMvccCoordinator.

What do you think?

Thanks
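
The selection rule in point 2 (prefer a dedicated coordinator, fall back to any regular server node) could be sketched roughly as follows. This is a minimal illustration, not Ignite code: the `Node` class, its `dedicated_coordinator` field (standing in for the proposed isMvccCoordinator flag), and the "oldest node wins" tie-break are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Node:
    """Hypothetical stand-in for a cluster node; not an Ignite class."""
    name: str
    order: int                           # assumed: join order in the topology
    dedicated_coordinator: bool = False  # models the proposed isMvccCoordinator flag
    alive: bool = True


def pick_coordinator(nodes: List[Node]) -> Optional[Node]:
    """Prefer the oldest live dedicated coordinator; if none is alive,
    fall back to the oldest live regular server node."""
    live = [n for n in nodes if n.alive]
    if not live:
        return None
    dedicated = [n for n in live if n.dedicated_coordinator]
    candidates = dedicated or live
    # Assumption: ties are broken by topology join order.
    return min(candidates, key=lambda n: n.order)
```

With one dedicated node in the topology it is always chosen; once every dedicated node is down, the oldest regular server takes over, which matches the development and failover scenarios described above.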

Nikolay Izhikov

Re: MVCC configuration

Hello, Semyon!

> It seems we need introduce special 'dedicated mvcc coordinator' node role

How will Ignite handle an "mvcc coordinator" failure?

What will happen if the coordinator fails in the middle of a transaction?
Could the tx be committed or rolled back?

Will there be some user notification if the coordinator becomes slower?

> IgniteConfiguration.isMvccCoordinator

flag name seems OK.

2017-09-18 12:39 GMT+03:00 Semyon Boikov <[hidden email]>:

--
Nikolay Izhikov
[hidden email]

Vladimir Ozerov

Re: MVCC configuration

In reply to this post by Semyon Boikov-2

Semen,

My comments:
1) I would propose having only a global flag for now:
IgniteConfiguration.isMvccEnabled.
One key design point we should keep in mind is that MVCC data *MUST* be
persistent. We can skip it in the first iteration, as we are focused on
key-based cache updates, where a typical transaction will update dozens or
hundreds of keys. But when transactional SQL is ready, we will have to
handle cases where many thousands or millions of rows are updated from
concurrent transactions. Without storing MVCC data on disk we will run out
of memory. Other vendors, such as Oracle and Postgres, store MVCC-related
data in data blocks. So there is a risk that we will not be able to manage
both MVCC and non-MVCC caches in a single cache group or single memory
policy. So per-cache configuration looks too dangerous to me at the moment.

2) I would also avoid this flag until we clearly understand that it is
needed. All numbers will be assigned from a single thread. For this reason,
even peak load on the coordinator should not consume too many resources. I
think we can assign coordinators automatically in the first iteration.

So my vote is to have a single global flag and nothing more:
IgniteConfiguration.isMvccEnabled.

Vladimir.
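
The "all numbers are assigned from a single thread" idea can be illustrated with a small sketch. It is only a model of the concept, not the IGNITE-3478 design: the class name `SingleThreadCoordinator` and its methods are hypothetical. Many client threads enqueue requests, and one worker thread hands out strictly increasing versions, so the counter itself needs no locking.

```python
import queue
import threading
from concurrent.futures import Future


class SingleThreadCoordinator:
    """Hypothetical coordinator: one worker thread assigns all versions."""

    def __init__(self) -> None:
        self._requests = queue.Queue()
        self._next_version = 1  # touched only by the worker thread
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def _run(self) -> None:
        while True:
            fut = self._requests.get()
            if fut is None:  # stop marker
                return
            fut.set_result(self._next_version)
            self._next_version += 1

    def request_version(self) -> Future:
        """Called from any thread; the result is the assigned version."""
        fut = Future()
        self._requests.put(fut)
        return fut

    def stop(self) -> None:
        self._requests.put(None)
        self._worker.join()
```

Each caller blocks on `request_version().result()`. That is also why a slow coordinator would delay every waiting transaction or query, whichever thread issued the request.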

On Mon, Sep 18, 2017 at 12:39 PM, Semyon Boikov <[hidden email]> wrote:

yzhdanov

Re: MVCC configuration

In reply to this post by Nikolay Izhikov

1. Agree. Let's disable MVCC by default.
2. Sam, if the user wants a dedicated mvcc-coordinator, then we can use the
configuration you suggested. However, I expect more properties will be
needed. How about having an MvccConfiguration bean? Once the topology has no
dedicated coordinators, it should pick some ordinary server (maybe
based on some stats about load and current partition distribution).

One more point: the user should be able to assign the coordinator
manually. I am pretty sure we can do it via a custom discovery message.

--Yakov

yzhdanov

Re: MVCC configuration

In reply to this post by Vladimir Ozerov

Vladimir, should it be on IgniteConfiguration or on CacheConfiguration? I
think mvcc should be enabled on a per-cache basis, and moreover it makes
sense only for tx caches.

--Yakov

Vladimir Ozerov

Re: MVCC configuration

Yakov,

MVCC for atomic caches makes sense as well: we will be able to read a
consistent data set, which is not possible now. As I explained above,
per-cache configuration might not work when we start working on the
transactional SQL design.

Moreover, it looks like overkill to me at the moment. We will need a
global flag anyway. This is convenient, as many applications will require
all data to be MVCC-protected. So the ideal solution would be to have
IgniteConfiguration.mvccEnabled
+ CacheConfiguration.mvccEnabled, but the latter could be skipped in the
first iteration.

On Mon, Sep 18, 2017 at 1:25 PM, Yakov Zhdanov <[hidden email]> wrote:

yzhdanov

Re: MVCC configuration

Ouch... of course it makes sense for atomic caches. Seems I am not fully
switched on after weekend =)

Agree on other points.

--Yakov

Semyon Boikov

Re: MVCC configuration

Guys,

I do not really understand mvcc for atomic caches. Could you please provide
a real use case?

Thank you

On Mon, Sep 18, 2017 at 1:37 PM, Yakov Zhdanov <[hidden email]> wrote:

Vladimir Ozerov

Re: MVCC configuration

Semen,

Consider the use case of an audit table where I log user actions over time.
Every action is a put to an ATOMIC cache. The user interacts with my
application and performs the following set of actions:
1. 08:00 MSK -> LOGIN
2. 08:10 MSK -> Update something
3. 08:20 MSK -> LOGOUT

If MVCC is there, whenever I query all actions performed by the user, I
would see either {}, {1}, {1, 2} or {1, 2, 3}.
Without MVCC I can see weird things, such as {1, 3} or {2}, or whatever.

Vladimir.
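
The prefix property in this example can be shown with a toy, single-process model. It is a sketch under stated assumptions, not Ignite code: `VersionedLog` and its methods are hypothetical, and it assumes every put receives a version from one monotonically increasing counter. A reader then sees only rows whose version is not newer than its snapshot.

```python
from typing import List, Tuple


class VersionedLog:
    """Hypothetical audit log where every put carries a commit version."""

    def __init__(self) -> None:
        self._version = 0
        self._rows: List[Tuple[int, str]] = []  # (commit version, action)

    def put(self, action: str) -> int:
        """Store an action under the next version and return that version."""
        self._version += 1
        self._rows.append((self._version, action))
        return self._version

    def snapshot(self) -> int:
        """Return the latest committed version as the reader's snapshot."""
        return self._version

    def read(self, snapshot_version: int) -> List[str]:
        """Return only the actions visible at the given snapshot."""
        return [a for v, a in self._rows if v <= snapshot_version]
```

Because versions follow the order of the puts, any snapshot yields a prefix of the action list: a reader can never observe LOGOUT without also observing LOGIN and the update.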

On Mon, Sep 18, 2017 at 1:41 PM, Semyon Boikov <[hidden email]> wrote:

yzhdanov

Re: MVCC configuration

Vladimir, I think we can ask the user to switch to a transactional cache to
support your example. Otherwise, it seems we are implicitly turning atomic
caches into tx caches.

--Yakov

2017-09-18 13:49 GMT+03:00 Vladimir Ozerov <[hidden email]>:

Vladimir Ozerov

Re: MVCC configuration

Yakov,

I would say that my example is not about adding transactions to ATOMIC
cache, but rather about adding consistent snapshots to it.

On Mon, Sep 18, 2017 at 1:59 PM, Yakov Zhdanov <[hidden email]> wrote:

Alexey Goncharuk

Re: MVCC configuration

Vladimir,

I doubt it will be possible to add any meaningful guarantees to ATOMIC
caches with MVCC. Consider a case when a user does a putAll, not a single
put. In this case, updates received by multiple primary nodes are not
connected in any way. Moreover, whenever a primary node fails, the put for
failed keys will be re-tried, which will lead to all sorts of overlapping
updates in case of parallel putAll. It is hard to suggest how we should
handle this, let alone explain this to a user.

-- AG

2017-09-18 14:50 GMT+03:00 Vladimir Ozerov <[hidden email]>:

Vladimir Ozerov

Re: MVCC configuration

Alex,

With putAll() on ATOMIC cache all bets are off, for sure.

On Mon, Sep 18, 2017 at 2:53 PM, Alexey Goncharuk <
[hidden email]> wrote:

Semyon Boikov

Re: MVCC configuration

In reply to this post by Nikolay Izhikov

Nikolay, thanks for comments

> How will Ignite handle "mvcc coordinator" fail?
> What will happen with if coordinator fails in the middle of a transaction?
> Could tx be committed or rollbacked?

I think coordinator failure will be handled in the same way as the failure
of one of the transaction's 'primary' nodes: if the coordinator fails during
the 'prepare' phase, then the tx is rolled back.

>> Will we have some user notification if coordinator becomes slower?

Currently Ignite has no common notion of 'user notifications', but we
can add some metrics for coordinator performance to the public API.

Thanks
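
The failure handling described above (coordinator lost during 'prepare' means the tx is rolled back) might look like this in outline. This is a hedged sketch only: `CoordinatorFailed` and `run_transaction` are hypothetical names, and the thread does not say what happens if the coordinator fails after 'prepare', so that case is not modelled.

```python
from typing import Callable


class CoordinatorFailed(Exception):
    """Hypothetical signal that the mvcc coordinator was lost."""


def run_transaction(prepare: Callable[[], None],
                    commit: Callable[[], None],
                    rollback: Callable[[], None]) -> str:
    """Roll back if the coordinator fails during prepare, else commit."""
    try:
        prepare()
    except CoordinatorFailed:
        # Same treatment as losing a primary node during prepare.
        rollback()
        return "ROLLED_BACK"
    commit()
    return "COMMITTED"
```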

On Mon, Sep 18, 2017 at 1:01 PM, Николай Ижиков <[hidden email]>
wrote:

Semyon Boikov

Re: MVCC configuration

Vladimir, thanks for comments

> 2) I would also avoid this flag until we clearly understand it is needed.
> All numbers will be assigned from a single thread. For this reason even
> peak load on coordinator should not consume too much resources. I think we
> can assign coordinators automatically in first iteration.

To me the need for dedicated coordinator nodes is clear: each mvcc
transaction/query will wait for the mvcc coordinator's response. If the
coordinator also processes cache operations/compute jobs, then any user code
executed on the coordinator and consuming a lot of CPU/heap will slow down
ALL mvcc transactions/queries. As a user, I want to make sure that the
coordinator node will process only internal requests related to mvcc.

Also, why do you think that all numbers should be assigned from a single
thread?

Thanks

On Mon, Sep 18, 2017 at 2:59 PM, Semyon Boikov <[hidden email]> wrote:

Alexey Kuznetsov

Re: MVCC configuration

In reply to this post by Semyon Boikov-2

Semyon,

How about having a node attribute "COORDINATOR_RANK" or "COORDINATOR_ORDER"?
This attribute can be 1, 2, 3....
The node with the minimal number will become the coordinator.
If it fails, the node with the next rank/order will be elected as the new
coordinator.

Make sense?
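
The rank-based election suggested here can be sketched in a few lines. This is an illustration under assumptions, not an Ignite API: `elect_coordinator` is a hypothetical helper, ranks are assumed to be unique, and the failed nodes are passed in explicitly instead of being detected by discovery.

```python
from typing import Dict, Optional, Set


def elect_coordinator(ranks: Dict[str, int], failed: Set[str]) -> Optional[str]:
    """Return the live node with the smallest rank, or None if none is left."""
    live = {node: rank for node, rank in ranks.items() if node not in failed}
    if not live:
        return None
    # Assumption: ranks are unique, so the minimum is unambiguous.
    return min(live, key=live.get)
```

For ranks `{"a": 1, "b": 2, "c": 3}`, node "a" is elected first; if "a" fails, the same call with `failed={"a"}` elects "b".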

On Mon, Sep 18, 2017 at 4:39 PM, Semyon Boikov <[hidden email]> wrote:

--
Alexey Kuznetsov

dsetrakyan

Re: MVCC configuration

In reply to this post by Vladimir Ozerov

On Mon, Sep 18, 2017 at 4:57 AM, Vladimir Ozerov <[hidden email]>
wrote:

> Alex,
>
> With putAll() on ATOMIC cache all bets are off, for sure.
>

Are we all in agreement that MVCC should only be enabled for transactional
caches then?

dsetrakyan

Re: MVCC configuration

In reply to this post by Semyon Boikov-2

On Mon, Sep 18, 2017 at 2:39 AM, Semyon Boikov <[hidden email]> wrote:

>
> 1. MVCC will definitely bring some performance overhead, so I think it
> should not be enabled by default, I'm going to add special flag on cache
> configuration: CacheConfiguration.isMvccEnabled.
>

Is it possible for several caches in the same cache group to have different
MVCC configuration?

> 2. In current mvcc architecture there should be some node in cluster
> assigning versions for tx updates and queries (mvcc coordinator). Mvcc
> coordinator is crucial component and it should perform as fast as possible.
> It seems we need introduce special 'dedicated mvcc coordinator' node role:
> it should not be possible to start cache on such node and it should not
> process user's compute jobs. At the same time it should be possible that
> any regular server node can become mvcc coordinator: this can be useful
> during development (no extra setup for mvcc will be needed), or support
> scenario when all dedicated coordinator nodes fail. So we need a way to
> make node a 'dedicated mvcc coordinator', we can add special flag on ignite
> configuration: IgniteConfiguration.isMvccCoordinator.
>

I agree that we need coordinator nodes, but I do not understand why we
can't reuse some cache nodes for it. Why do we need to ask the user to start
up yet another type of node?

Alexey Goncharuk

Re: MVCC configuration

>
> I agree that we need coordinator nodes, but I do not understand why can't
> we reuse some cache nodes for it? Why do we need to ask user to start up
> yet another type of node?
>

Dmitriy,

My understanding is that Semyon does not rule out using a cache node as a
coordinator. This property will optionally allow a *dedicated* node to
serve as the coordinator to improve cluster throughput under heavy load.

Vladimir Ozerov

Re: MVCC configuration

This could be something like "preferredMvccCoordinator".

On Tue, Sep 19, 2017 at 10:40 AM, Alexey Goncharuk <
[hidden email]> wrote:
