Partition loss policy - how to use?

Valentin Kulichenko
Folks,

In 2.0 we introduced PartitionLossPolicy, which blocks access to a cache
if data loss has occurred. This is an awesome feature; however, it is not
very clear how to use it properly.

First of all, there is no documentation. A ticket already exists, though,
and hopefully it will be completed soon:
https://issues.apache.org/jira/browse/IGNITE-6994

Second, it looks like the required tooling is missing. Visor and Web
Console should be able to show the status (i.e. which partitions are
available and which are not), fire alerts in case of partition loss,
provide a way to restore partitions via the Ignite#resetLostPartitions
method, etc.

And finally (most importantly), I'm a bit confused by the
Ignite#resetLostPartitions method itself. What are the best practices
for using it? How should a user decide that partitions are actually
restored, and how should they choose when to call this method? For example,
if we have persistence enabled, is it enough to just bring back the nodes,
or is something else needed? Actually, why don't we detect this
automatically in this scenario?
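
To make the question concrete, here is one possible shape for the manual
procedure, as a minimal Java sketch and not a recommended practice. It
assumes the operator treats "all expected nodes are back" as the recovery
condition, which is purely an assumption of this sketch. To stay runnable
without a cluster, the Ignite calls are passed in as plain values:
lostByCache stands for what IgniteCache#lostPartitions() reports per cache,
and resetAction stands for Ignite#resetLostPartitions(Collection<String>).
The class, method, and parameter names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Consumer;

// Hypothetical helper, not part of the Ignite API.
final class LostPartitionReset {
    /**
     * Calls resetAction for the caches that report lost partitions, but only
     * if every expected node is alive again (the assumed recovery condition).
     * Returns the sorted cache names passed to resetAction, or an empty list.
     */
    static List<String> resetIfNodesReturned(
            Map<String, Collection<Integer>> lostByCache, // per cache: IgniteCache#lostPartitions()
            Set<String> aliveNodes,                       // IDs of nodes currently in the cluster
            Set<String> expectedNodes,                    // IDs the operator expects to be present
            Consumer<Collection<String>> resetAction) {   // e.g. ignite::resetLostPartitions
        if (!aliveNodes.containsAll(expectedNodes)) {
            return Collections.emptyList(); // some data owners are still missing
        }

        List<String> toReset = new ArrayList<>();
        for (Map.Entry<String, Collection<Integer>> e : lostByCache.entrySet()) {
            if (!e.getValue().isEmpty()) {
                toReset.add(e.getKey());
            }
        }
        Collections.sort(toReset);

        if (!toReset.isEmpty()) {
            resetAction.accept(toReset);
        }
        return toReset;
    }

    public static void main(String[] args) {
        Map<String, Collection<Integer>> lost = new HashMap<>();
        lost.put("orders", List.of(1, 2)); // two partitions reported as lost
        lost.put("users", List.of());      // nothing lost

        List<String> reset = resetIfNodesReturned(
                lost,
                Set.of("node-a", "node-b"),
                Set.of("node-a", "node-b"),
                names -> System.out.println("would reset: " + names));
        System.out.println("caches reset: " + reset);
    }
}
```

With a real cluster, the lambda would be replaced by a call to
Ignite#resetLostPartitions, and the map would be filled from each cache's
lostPartitions() result. Whether node membership alone is a sufficient
condition is exactly the open question here.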

I would appreciate any input and thoughts on this topic.

-val

Re: Partition loss policy - how to use?

Alexey Kuznetsov
Val,

I'm not an expert in how the cache subsystem works internally,
but I think it is a good idea to add this info and tooling to Web Console
/ Visor.

Could you create issues describing which part of Web Console / Visor
it could be added to?
For Visor CMD, I guess it could be a mode of the "cache" command.


--
Alexey Kuznetsov

Re: Partition loss policy - how to use?

dmagda
Alex K., thanks for stepping in.

It will become clear what to do on the tooling side once we understand how to use the feature in general.

Alex G., Sam, Yakov, could you comment on this point below?

>> And finally (most importantly), I'm a bit confused
>> by Ignite#resetLostPartitions method itself. What are the best practices
>> for using it? How a user should decide that partitions are actually
>> restored and how should he choose when to call this method? For example, if
>> we have persistence enabled, is it enough to just bring back the nodes or
>> something else is needed? Actually, why don't we detect this automatically
>> in this scenario?


Denis


Re: Partition loss policy - how to use?

Valentin Kulichenko
A couple more questions:
- Is there any way to know which partitions in which caches are currently
unavailable?
- How can I check which node failures triggered the data loss?

The only way I see is to listen for the EVT_CACHE_REBALANCE_PART_DATA_LOST
event... Probably we should add something more obvious to the API?
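
The event-based approach could be sketched roughly like this. The tracker
below is a hypothetical helper, not part of the Ignite API: it assumes a
local listener registered for EVT_CACHE_REBALANCE_PART_DATA_LOST forwards
the cache name, the partition number, and the ID of the node named in each
event. Events of that type have to be enabled in the node configuration for
such a listener to fire. Whether the node named in an event is the only one
responsible for the loss is not something this sketch can establish, so it
just records it.

```java
import java.util.Map;
import java.util.Set;
import java.util.SortedSet;
import java.util.TreeSet;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentSkipListSet;

// Hypothetical helper that accumulates what partition-loss events report.
final class LostPartitionTracker {
    private final Map<String, ConcurrentSkipListSet<Integer>> lostByCache = new ConcurrentHashMap<>();
    private final Set<String> nodesSeenInLossEvents = ConcurrentHashMap.newKeySet();

    /** To be called from the event listener for every partition-loss event. */
    void onPartitionLost(String cacheName, int partition, String nodeId) {
        lostByCache.computeIfAbsent(cacheName, k -> new ConcurrentSkipListSet<>()).add(partition);
        if (nodeId != null) {
            nodesSeenInLossEvents.add(nodeId);
        }
    }

    /** To be called after the lost partitions of a cache have been reset. */
    void onReset(String cacheName) {
        lostByCache.remove(cacheName);
    }

    /** Snapshot of the partitions recorded as lost for the given cache. */
    SortedSet<Integer> lostPartitions(String cacheName) {
        ConcurrentSkipListSet<Integer> parts = lostByCache.get(cacheName);
        return parts == null ? new TreeSet<>() : new TreeSet<>(parts);
    }

    /** Snapshot of the node IDs that appeared in loss events so far. */
    SortedSet<String> nodesSeenInLossEvents() {
        return new TreeSet<>(nodesSeenInLossEvents);
    }
}
```

A tracker like this only sees events raised after the listener was
registered on that node, which is one more reason a direct API for
querying lost partitions would be more obvious.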

-Val


Re: Partition loss policy - how to use?

gauravhb
Hi,

Is there any update on this topic?
Were any tickets created for the points mentioned by Valentin?

Thanks.



--
Sent from: http://apache-ignite-developers.2346864.n4.nabble.com/

Re: Partition loss policy - how to use?

dmagda
Hi,

Here is the documentation we prepared for the 2.4 release:
https://apacheignite.readme.io/v2.3/docs/cache-modes-24#partition-loss-policies

It's hidden for now and will become visible to everyone once the Ignite 2.4
vote passes (in progress).

--
Denis


Re: Partition loss policy - how to use?

dmagda
For those interested, here is a doc we put together on the partition loss
policies, which covers the extra improvements released in 2.4:
https://apacheignite.readme.io/v2.4/docs/partition-loss-policies

--
Denis


Re: Partition loss policy - how to use?

gauravhb
Hi Denis,
Thanks. The document certainly looks useful. Do we have a ticket for adding
support for invoking resetLostPartitions() to Web Console / Visor?


Regards,
Gaurav


Re: Partition loss policy - how to use?

dmagda
Hi Gaurav,

I'm not sure about that, but it sounds like the right addition anyway.

Alex K., could you chime in please and clarify whether the tools support
the loss policies in any way?

--
Denis



Re: Partition loss policy - how to use?

Alexey Kuznetsov
Gaurav,

I think it makes sense to add this to the tools.
Created issue: https://issues.apache.org/jira/browse/IGNITE-7940

--
Alexey Kuznetsov

Re: Partition loss policy - how to use?

gauravhb
Alexey,

Thanks. I wonder, why not Web Console?

Thanks,
Gaurav


Re: Partition loss policy - how to use?

dmagda
The monitoring screen that would show the partition loss metrics is not
open source. It's a plugin that GridGain built on top of the console
foundation.

However, it can be used for free, even with Ignite clusters, via this
deployment: https://console.gridgain.com/

--
Denis
