Apache Ignite Developers - Legacy Mail Archive

Various shutdown guaranties

V.Pyatkov

Various shutdown guaranties

Hi,

We need the ability to call shutdown with various guarantees.
For example:
A node needs to be rebooted, but afterwards it should be available for
historical rebalance (all partitions in the MOVING state should have moved
to OWNING).

A rolling restart of the cluster is performed, but all data should remain
available during that time (at least one copy of each partition should be
available in the cluster).

We need to wait not only for data availability but also for all jobs to
complete (previously this behavior was available through a stop(false)
method invocation).

Each of these cases requires different behavior before shutting down a node.
I propose to slightly modify the public API and add a method that specifies
the shutdown behavior directly:
Ignite.close(Shutdown)

public enum Shutdown {
    /**
     * Stop immediately, as soon as the components are ready.
     */
    IMMEDIATE,
    /**
     * Stop the node once all partitions have finished moving to or from
     * this node.
     */
    NORMAL,
    /**
     * Stop the node if and only if it does not store any unique
     * partitions, that is, partitions with no other copies in the cluster.
     */
    GRACEFUL,
    /**
     * Stop the node gracefully and wait for all jobs to finish before
     * shutdown.
     */
    ALL
}

Ignite.close() without a parameter will use the shutdown behavior
configured cluster-wide. This will be implemented through the distributed
metastorage and additional configuration utilities.
A method to configure the shutdown behavior on start will also be added; it
would look like IgniteConfiguration.setShutdown(Shutdown).
If no shutdown behavior is configured, everything will work as before,
according to the IMMEDIATE behavior.
All other close methods will be marked as deprecated.
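
The fallback order described above can be sketched as follows. This is only
an illustration: the nested Shutdown enum mirrors the proposal, while
ShutdownResolutionSketch and resolve(...) are hypothetical names and not part
of the Ignite API. A null argument stands for "not specified".

```java
/** Sketch only: mirrors the proposal in this message, not an existing Ignite API. */
final class ShutdownResolutionSketch {
    /** Hypothetical copy of the proposed enum. */
    enum Shutdown { IMMEDIATE, NORMAL, GRACEFUL, ALL }

    /**
     * Picks the behavior for a stop request.
     *
     * @param explicit   value passed to close(Shutdown), or null for plain close()
     * @param configured value configured for the cluster, or null if none was set
     */
    static Shutdown resolve(Shutdown explicit, Shutdown configured) {
        if (explicit != null)
            return explicit;        // close(Shutdown): the caller decides
        if (configured != null)
            return configured;      // close(): fall back to the configured value
        return Shutdown.IMMEDIATE;  // nothing configured: behave as before
    }

    public static void main(String[] args) {
        System.out.println(resolve(null, null));
        System.out.println(resolve(null, Shutdown.GRACEFUL));
        System.out.println(resolve(Shutdown.IMMEDIATE, Shutdown.GRACEFUL));
    }
}
```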

I look forward to your opinions.

--
Sent from: http://apache-ignite-developers.2346864.n4.nabble.com/

Alexei Scherbakov

Re: Various shutdown guaranties

V.Pyatkov

While I agree we need a way to prevent unintentional data loss on shutdown,
I do not like the proposed shutdown flags enum.
I see no relation between possible data loss on shutdown and waiting for
some jobs to complete.

All we need is a new method (duplicated by a system property), like

IgniteConfiguration.setShutdownPolicy(GRACEFUL|DEFAULT);
and an optional
IgniteConfiguration.setGracefulShutdownTimeout(long); // Force a shutdown if the timeout expires.

With the graceful policy enabled, a node should not normally stop if it is
the last owner of any partition.
This will prevent unintentional data loss on stop whenever possible, for
example if a grid is deployed on Kubernetes.

The properties should also be changeable at runtime using JMX or the
control.sh interface.
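
A minimal sketch of the "last owner" check with a forced stop on timeout
might look like this. It assumes the partition ownership can be viewed as a
map from partition id to the set of owning node ids; GracefulStopSketch,
isLastOwner and awaitSafeToStop are hypothetical names used for illustration
and are not Ignite classes or methods.

```java
import java.util.Map;
import java.util.Set;
import java.util.function.Supplier;

/** Sketch only: the ownership map and all names here are illustrative assumptions. */
final class GracefulStopSketch {
    /** Returns true if nodeId is the only owner of at least one partition. */
    static boolean isLastOwner(String nodeId, Map<Integer, Set<String>> owners) {
        for (Set<String> nodes : owners.values()) {
            if (nodes.size() == 1 && nodes.contains(nodeId))
                return true;
        }
        return false;
    }

    /**
     * Polls the ownership view until the node is no longer a last owner
     * or the timeout expires.
     *
     * @return true if stopping is safe; false if the wait timed out (or was
     *         interrupted) and the stop would have to be forced.
     */
    static boolean awaitSafeToStop(String nodeId, Supplier<Map<Integer, Set<String>>> view,
        long timeoutMs, long pollMs) {
        long deadline = System.nanoTime() + timeoutMs * 1_000_000L;

        while (isLastOwner(nodeId, view.get())) {
            if (System.nanoTime() - deadline >= 0)
                return false; // timeout expired: the caller forces the shutdown

            try {
                Thread.sleep(pollMs);
            }
            catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Partition 1 has a single copy, held by node A.
        Map<Integer, Set<String>> owners = Map.of(0, Set.of("A", "B"), 1, Set.of("A"));

        System.out.println(awaitSafeToStop("B", () -> owners, 100, 10));
        System.out.println(awaitSafeToStop("A", () -> owners, 100, 10));
    }
}
```

In this sketch node B may stop at once, while node A waits for the full
timeout because no other copy of partition 1 ever appears.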

On Mon, Jun 8, 2020 at 13:46, V.Pyatkov <[hidden email]> wrote:


--

Best regards,
Alexei Scherbakov

Alexei Scherbakov

Re: Various shutdown guaranties

The graceful policy should apply only to caches with a number of
backups > 0.
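
One way to read this restriction: a cache without backups always has exactly
one copy of each partition, so a "last owner" check on it could never pass.
The sketch below skips such caches. CacheView and the other names are
hypothetical and only illustrate the rule; they are not Ignite types.

```java
import java.util.List;

/** Sketch only: CacheView is a hypothetical per-cache summary, not an Ignite type. */
final class BackupAwareGracefulSketch {
    /** State of one cache as seen from the node that wants to stop. */
    static final class CacheView {
        final String name;
        final int backups;
        final boolean lastOwnerOfSomePartition;

        CacheView(String name, int backups, boolean lastOwnerOfSomePartition) {
            this.name = name;
            this.backups = backups;
            this.lastOwnerOfSomePartition = lastOwnerOfSomePartition;
        }
    }

    /** A cache delays a graceful stop only if it has backups and this node holds a last copy. */
    static boolean delaysStop(CacheView cache) {
        return cache.backups > 0 && cache.lastOwnerOfSomePartition;
    }

    /** The node may stop gracefully when no cache delays the stop. */
    static boolean canStop(List<CacheView> caches) {
        return caches.stream().noneMatch(BackupAwareGracefulSketch::delaysStop);
    }

    public static void main(String[] args) {
        CacheView noBackups = new CacheView("no-backups", 0, true);
        CacheView withBackups = new CacheView("with-backups", 1, true);

        System.out.println(canStop(List.of(noBackups)));
        System.out.println(canStop(List.of(noBackups, withBackups)));
    }
}
```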

On Mon, Jun 8, 2020 at 14:54, Alexei Scherbakov <[hidden email]> wrote:


--

Best regards,
Alexei Scherbakov

V.Pyatkov

Re: Various shutdown guaranties

> I see no relation between possible data loss on shutdown and waiting for
> some jobs to complete.

Yes, I do not think there is one either.
Moreover, I want to unite waiting for jobs and waiting for rebalance in one
shutdown policy. Rather, these shutdown types represent different kinds of
waiting.
IMMEDIATE - nothing to wait for; simply call stop on the components.
NORMAL - wait for everything we can on the local node before calling close.
GRACEFUL - stricter; in addition, we wait for some distributed conditions
(for example, waiting for nodes that hold copies of the local partitions).

Nothing more than that. I would prefer to exclude ALL.
enum Shutdown {
    IMMEDIATE,
    NORMAL,
    GRACEFUL
}
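
The three levels can be read as cumulative sets of waits, where each stricter
level adds conditions to the previous one. A rough sketch of that reading
follows; the step names are illustrative labels rather than Ignite internals,
and ShutdownWaitsSketch and waitsFor are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch only: the wait steps are illustrative labels, not Ignite internals. */
final class ShutdownWaitsSketch {
    /** Ordered from least to most strict, as in the reduced enum above. */
    enum Shutdown { IMMEDIATE, NORMAL, GRACEFUL }

    /** Returns the conditions a node would wait for before stopping its components. */
    static List<String> waitsFor(Shutdown shutdown) {
        List<String> waits = new ArrayList<>();

        // NORMAL and stricter: everything that can be awaited on the local node.
        if (shutdown.compareTo(Shutdown.NORMAL) >= 0) {
            waits.add("local jobs");
            waits.add("local rebalance");
        }

        // GRACEFUL only: distributed conditions on top of the local ones.
        if (shutdown.compareTo(Shutdown.GRACEFUL) >= 0)
            waits.add("other nodes holding copies of local partitions");

        return waits;
    }

    public static void main(String[] args) {
        for (Shutdown shutdown : Shutdown.values())
            System.out.println(shutdown + " waits for " + waitsFor(shutdown));
    }
}
```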

> IgniteConfiguration.setGracefulShutdownTimeout(long); // Force a shutdown

I suppose this can be done manually. If the client considers that the node
is taking too long to stop, it can invoke Ignite.close(IMMEDIATE) (or
kill -9 <PID>).

> IgniteConfiguration.setShutdownPolicy(GRACEFUL|DEFAULT);

That does not allow deciding the shutdown policy at runtime. If that is
acceptable, we can hide the stop(Shutdown) method in the internal API.

> The properties also should be changeable at runtime using JMX or
> control.sh interface.

Previously I was sure the shutdown type would be stored in the cluster, but
our implicit policy says: "All properties configured through JMX will not
be saved after a restart."
