Apache Ignite Developers - Legacy Mail Archive

Partition loss policy to disable cache completely

Valentin Kulichenko

Partition loss policy to disable cache completely

Folks,

Our PartitionLossPolicy lets us disable operations on lost partitions;
however, all available policies allow any operation on partitions that were
not lost. It seems to me it would be very useful to also have a policy that
completely blocks the cache in case of data loss. Is it possible to add one?

And as a side question: what is the difference between the READ_WRITE_ALL and
IGNORE policies? It looks like both allow reads and writes on all
partitions.

-Val

yzhdanov

Re: Partition loss policy to disable cache completely

Val,

Your suggestion to prohibit any cache operation on partition loss does not
make sense to me. Why should I care about some partition during a particular
operation if I don't access it? Imagine I use data on nodes A and B,
performing reads and writes, and node C crashes in the middle of a tx. Should
my tx be rolled back? I think not.

As for the difference, it seems that IGNORE resets the lost status of affected
partitions and READ_WRITE_ALL does not.

* @see Ignite#resetLostPartitions(Collection)
* @see IgniteCache#lostPartitions()

--Yakov

2018-01-17 14:36 GMT-08:00 Valentin Kulichenko <[hidden email]>:

> Folks,
>
> Our PartitionLossPolicy lets us disable operations on lost partitions;
> however, all available policies allow any operation on partitions that were
> not lost. It seems to me it would be very useful to also have a policy that
> completely blocks the cache in case of data loss. Is it possible to add
> one?
>
> And as a side question: what is the difference between the READ_WRITE_ALL and
> IGNORE policies? It looks like both allow reads and writes on all
> partitions.
>
> -Val
>

Alexey Goncharuk

Re: Partition loss policy to disable cache completely

Valentin,

I am OK with having a policy that prohibits all cache operations, and this
is not very hard to implement. However, I agree with Yakov: I do not see
any point in reducing cluster availability when operations can be safely
completed.

2018-01-23 2:22 GMT+03:00 Yakov Zhdanov <[hidden email]>:

> Val,
>
> Your suggestion to prohibit any cache operation on partition loss does not
> make sense to me. Why should I care about some partition during a particular
> operation if I don't access it? Imagine I use data on nodes A and B,
> performing reads and writes, and node C crashes in the middle of a tx. Should
> my tx be rolled back? I think not.
>
> As for the difference, it seems that IGNORE resets the lost status of affected
> partitions and READ_WRITE_ALL does not.
>
> * @see Ignite#resetLostPartitions(Collection)
> * @see IgniteCache#lostPartitions()
>
> --Yakov

yzhdanov

Re: Partition loss policy to disable cache completely

Alex, I am against restricting cluster operation. I tried to explain in the
previous email that it is impossible to have a consistent approach here. You
can prohibit operations only after the exchange completes. However, by then
plenty of transactions will have been committed on the previous cache topology
even though nodes they do not touch have crashed or left the grid.

--Yakov

2018-01-23 9:28 GMT-08:00 Alexey Goncharuk <[hidden email]>:

> Valentin,
>
> I am OK with having a policy that prohibits all cache operations, and this
> is not very hard to implement. However, I agree with Yakov: I do not see
> any point in reducing cluster availability when operations can be safely
> completed.

Valentin Kulichenko

Re: Partition loss policy to disable cache completely

Yakov,

I still think there are valid use cases. Off the top of my head: what if
one wants to iterate through multiple partitions and do some calculations?
Locking and transactional semantics are not needed, but if some of the data
is LOST, the computation should fail, and new computations should not even
start. Basically, you assume that if two entries are stored in different
partitions and are not accessed in the same transaction, then these entries are
completely unrelated to each other. In my experience, this assumption is
incorrect.

-Val

On Tue, Jan 23, 2018 at 11:03 AM, Yakov Zhdanov <[hidden email]> wrote:

> Alex, I am against restricting cluster operation. I tried to explain in the
> previous email that it is impossible to have a consistent approach here. You
> can prohibit operations only after the exchange completes. However, by then
> plenty of transactions will have been committed on the previous cache topology
> even though nodes they do not touch have crashed or left the grid.
>
> --Yakov

yzhdanov

Re: Partition loss policy to disable cache completely

Val, your computation fails once it reaches the absent partition. I agree
with the point that no new computation should start. Guys, any ideas
on how to achieve that? I would think of a scan/SQL query checking that there
is no data loss on the current topology version prior to start. Val, please
note that along with queries that require the full data set there can be some
operations that require only a limited set of partitions (most probably only 1).
So, there is no point in such strict limitations. Agree?

--Yakov

dsetrakyan

Re: Partition loss policy to disable cache completely

Why not just add a new policy as Val suggested?

D.

On Jan 23, 2018, at 4:44 PM, Yakov Zhdanov <[hidden email]> wrote:

>Val, your computation fails once it reaches the absent partition. I agree
>with the point that no new computation should start. Guys, any ideas
>on how to achieve that? I would think of a scan/SQL query checking that there
>is no data loss on the current topology version prior to start. Val, please
>note that along with queries that require the full data set there can be some
>operations that require only a limited set of partitions (most probably only 1).
>So, there is no point in such strict limitations. Agree?
>
>--Yakov

yzhdanov

Re: Partition loss policy to disable cache completely

I'm still not sure what Val has suggested. Dmitry, Val, do you have any
concrete API/algorithm in mind?

--Yakov

Valentin Kulichenko

Re: Partition loss policy to disable cache completely

Yakov,

My suggestion is to introduce a new policy (e.g. READ_WRITE_NONE) which works
in the same way as READ_WRITE_SAFE but for all partitions, not only lost
ones. Any operation attempted on a topology version which has at least one
lost partition should throw an exception. Will this work?

-Val

On Tue, Jan 23, 2018 at 5:20 PM, Yakov Zhdanov <[hidden email]> wrote:

> I'm still not sure what Val has suggested. Dmitry, Val, do you have any
> concrete API/algorithm in mind?
>
> --Yakov
>
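
For illustration only, the proposal above can be sketched as a toy model in
plain Java. This is not Ignite code: the class LossPolicyModel and the method
isAllowed are hypothetical names, and READ_WRITE_NONE is only the policy name
suggested in this thread. The sketch assumes READ_WRITE_SAFE rejects an
operation only when the partition it touches is lost, as described above.

```java
import java.util.Set;

/** Toy model of partition-loss checks; hypothetical, not Ignite's implementation. */
final class LossPolicyModel {
    /** READ_WRITE_NONE is the hypothetical policy proposed in this thread. */
    enum Policy { READ_WRITE_SAFE, READ_WRITE_NONE }

    /**
     * Returns true if an operation touching {@code partition} may proceed,
     * given the set of partitions currently marked as lost.
     */
    static boolean isAllowed(Policy policy, Set<Integer> lost, int partition) {
        switch (policy) {
            case READ_WRITE_SAFE:
                // Only operations on lost partitions are rejected.
                return !lost.contains(partition);
            case READ_WRITE_NONE:
                // A single lost partition blocks every operation on the cache.
                return lost.isEmpty();
            default:
                throw new IllegalArgumentException("Unknown policy: " + policy);
        }
    }

    public static void main(String[] args) {
        Set<Integer> lost = Set.of(7);
        // Partition 3 is healthy, partition 7 is lost.
        System.out.println(isAllowed(Policy.READ_WRITE_SAFE, lost, 3)); // true
        System.out.println(isAllowed(Policy.READ_WRITE_NONE, lost, 3)); // false
    }
}
```

The model checks only the lost set known at the time of the call; it says
nothing about operations that start before the loss is detected.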

yzhdanov

Re: Partition loss policy to disable cache completely

No. This is not 100% consistent, since operations started on the previous
version after a node has left (but before the system has received the event)
would succeed. For me, consistent behavior is to throw an exception for
"select avg(x) from bla" if data is currently missing or any data loss occurs
in the middle of the query, and to return a result for cache.get(key) if the
partition for that key is still in the grid.

--Yakov
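
As a rough sketch of the behavior described above, the following toy model in
plain Java separates full-data-set queries from single-key reads. It is not
Ignite code: LossAwareOps, partitionOf, fullScanAllowed and getAllowed are
hypothetical names, and the hash-modulo key-to-partition mapping is an
assumption made only to keep the example self-contained.

```java
import java.util.Set;

/** Toy model of per-operation loss checks; hypothetical, not Ignite's implementation. */
final class LossAwareOps {
    /** Assumed affinity: the key's hash modulo the partition count. */
    static int partitionOf(Object key, int partitions) {
        return Math.floorMod(key.hashCode(), partitions);
    }

    /**
     * A query over the full data set (e.g. an aggregate) is valid only if no
     * partition was lost when it started and none was lost while it ran.
     */
    static boolean fullScanAllowed(Set<Integer> lostAtStart, Set<Integer> lostAtEnd) {
        return lostAtStart.isEmpty() && lostAtEnd.isEmpty();
    }

    /** A single-key read needs only the partition that owns the key. */
    static boolean getAllowed(Object key, int partitions, Set<Integer> lost) {
        return !lost.contains(partitionOf(key, partitions));
    }
}
```

With 8 partitions and partition 7 lost, getAllowed(5, 8, Set.of(7)) is true
while fullScanAllowed(Set.of(), Set.of(7)) is false, which matches the
distinction drawn above between cache.get(key) and an aggregate query.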