CRITICAL - NATIVE PERSISTENCE: Cache data is destroyed after disable WAL and restart

Manu
Hi!

I have a question: is it expected behaviour that, when WAL is manually disabled for a persisted
cache and the server node is restarted, the persisted content of the
cache is completely destroyed?

We need to disable WAL during large, heavy ingestion processes to improve performance, as is recommended. However, an ingestion may occasionally fail (OS or machine crash), leaving WAL disabled. In this situation, when we restart the server nodes, the cache's persistence directory is deleted and recreated, so the data is lost.

This behaviour was introduced in this new feature: Ability to disable WAL (Non recoverable case)

Regards.

This is the method that performs the deletion:
org.apache.ignite.internal.processors.cache.persistence.file.FilePageStoreManager.beforeCacheGroupStart

Process to replicate it:

Ignite version 2.7.0

1. Start one or more server nodes with native persistence enabled
2. Create a cache (natively persisted) and store some data
3. Disable WAL for cache - ignite().cluster().disableWal("TheCacheName")
4. Restart the server node(s)
5. Check the cache directory: it was deleted and recreated, and all data was lost.
6. Not only that: store some more data and restart again, and you will see that this data is lost too. In general, data is lost on every server restart until WAL is re-enabled.
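
For the ingestion scenario described above, one partial mitigation can be sketched in Java: re-enable WAL in a finally block, and check the WAL state before any planned restart. This is only a sketch under stated assumptions. WalControl, FakeWalControl and WalGuardSketch are hypothetical names introduced for illustration; WalControl just mirrors the disableWal, enableWal and isWalEnabled methods of IgniteCluster so the example runs without a cluster. A finally block cannot help when the OS or the machine crashes, so this covers only ingestion failures that surface as exceptions.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical stand-in for the three IgniteCluster WAL methods used here. */
interface WalControl {
    boolean disableWal(String cacheName);
    boolean enableWal(String cacheName);
    boolean isWalEnabled(String cacheName);
}

/** In-memory fake (an assumption for this sketch) so it runs without a cluster. */
class FakeWalControl implements WalControl {
    private final Set<String> disabled = new HashSet<>();

    public boolean disableWal(String cacheName) { return disabled.add(cacheName); }
    public boolean enableWal(String cacheName) { return disabled.remove(cacheName); }
    public boolean isWalEnabled(String cacheName) { return !disabled.contains(cacheName); }
}

class WalGuardSketch {
    /** Runs the ingestion with WAL off and turns WAL back on even if the ingestion throws. */
    static void ingestWithWalDisabled(WalControl cluster, String cacheName, Runnable ingestion) {
        cluster.disableWal(cacheName);
        try {
            ingestion.run();
        } finally {
            // Reached on success and on exceptions, but not on an OS or machine crash.
            cluster.enableWal(cacheName);
        }
    }

    /**
     * Check to run before a planned restart: if an earlier run left WAL off,
     * turn it back on. Returns true if this call had to re-enable WAL.
     */
    static boolean ensureWalEnabledBeforeRestart(WalControl cluster, String cacheName) {
        if (cluster.isWalEnabled(cacheName)) {
            return false;
        }
        cluster.enableWal(cacheName);
        return true;
    }

    public static void main(String[] args) {
        WalControl cluster = new FakeWalControl();
        try {
            ingestWithWalDisabled(cluster, "TheCacheName", () -> {
                throw new IllegalStateException("simulated ingestion failure");
            });
        } catch (IllegalStateException e) {
            System.out.println("Ingestion failed: " + e.getMessage());
        }
        System.out.println("WAL enabled after failure: " + cluster.isWalEnabled("TheCacheName"));
    }
}
```

Against a real cluster, the fake would presumably be replaced by a thin adapter that forwards the three calls to ignite().cluster(). If a node has already been restarted while WAL was disabled, this does not help, since the data is gone by then.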

Call stack on server node start:
*org.apache.ignite.internal.processors.cache.persistence.file.FilePageStoreManager.beforeCacheGroupStart*
org.apache.ignite.internal.processors.cache.ClusterCachesInfo.registerCacheGroup
org.apache.ignite.internal.processors.cache.ClusterCachesInfo.registerNewCache
org.apache.ignite.internal.processors.cache.ClusterCachesInfo.processJoiningNode
org.apache.ignite.internal.processors.cache.ClusterCachesInfo.onStart
*org.apache.ignite.internal.processors.cache.GridCacheProcessor.startCachesOnStart*
org.apache.ignite.internal.processors.cache.GridCacheProcessor.onReadyForRead
org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager.notifyMetastorageReadyForRead
org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager.readMetastore
org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager.notifyMetaStorageSubscribersOnReadyForRead
org.apache.ignite.internal.IgniteKernal.start
org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.start0
org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.start
org.apache.ignite.internal.IgnitionEx.start0
org.apache.ignite.internal.IgnitionEx.startConfigurations
org.apache.ignite.internal.IgnitionEx.start
org.apache.ignite.internal.IgnitionEx.start
org.apache.ignite.internal.IgnitionEx.start
org.apache.ignite.internal.IgnitionEx.start
org.apache.ignite.Ignition.start
org.apache.ignite.startup.cmdline.CommandLineStartup.main

Re: NATIVE PERSISTENCE: Cache data is destroyed after disable WAL and restart

Mmuzaf
Hello Manu, sorry for the delay.

By disabling WAL for a particular cache group, you remove any recovery
guarantees for it in case of node failure. It is a normal trade-off
between data loading speed and later recoverability. For instance, if
the checkpoint thread (a thread which flushes cache data to disk) fails
in the middle of its write process, we have two options: start the node
with corrupted data (since WAL is disabled and there is no chance to
recover it), or clear all the cache data previously written. It is
better to choose the second option.

What do you think?

On Sun, 28 Apr 2019 at 17:05, Manu <[hidden email]> wrote:

>
> Hi!
>
> I have a question, is it normal that if WAL is deactivated for a persisted
> cache when the server node(s) is restarted, the persisted content of the
> cache is completely destroyed?
>
> I need to disable WAL for large heavy ingestion processes, but eventually
> ingestion may fail (OS, machine crash), so WAL state is not re-enabled after
> ingestion. On this situation if I restart a server node, cache persistent
> directory is deleted and recreated again, so data is lost.
>
> Thanks!
>
> This is the method that do this hell thing
> org.apache.ignite.internal.processors.cache.persistence.file.FilePageStoreManager.beforeCacheGroupStart
>
> Process to replicate it:
>
> 1. Start one or more server nodes with native persistence enabled
> 2. Create a cache (natively persisted) and store some data
> 3. Disable WAL for cache - ignite().cluster().disableWal("TheCacheName")
> 4. Restart server/s nodes
> 5. Check cache directory was deleted and recreated again, all data was lost.
>
> Call stack on server node start:
> *org.apache.ignite.internal.processors.cache.persistence.file.FilePageStoreManager.beforeCacheGroupStart*
> org.apache.ignite.internal.processors.cache.ClusterCachesInfo.registerCacheGroup
> org.apache.ignite.internal.processors.cache.ClusterCachesInfo.registerNewCache
> org.apache.ignite.internal.processors.cache.ClusterCachesInfo.processJoiningNode
> org.apache.ignite.internal.processors.cache.ClusterCachesInfo.onStart
> *org.apache.ignite.internal.processors.cache.GridCacheProcessor.startCachesOnStart*
> org.apache.ignite.internal.processors.cache.GridCacheProcessor.onReadyForRead
> org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager.notifyMetastorageReadyForRead
> org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager.readMetastore
> org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager.notifyMetaStorageSubscribersOnReadyForRead
> org.apache.ignite.internal.IgniteKernal.start
> org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.start0
> org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.start
> org.apache.ignite.internal.IgnitionEx.start0
> org.apache.ignite.internal.IgnitionEx.startConfigurations
> org.apache.ignite.internal.IgnitionEx.start
> org.apache.ignite.internal.IgnitionEx.start
> org.apache.ignite.internal.IgnitionEx.start
> org.apache.ignite.internal.IgnitionEx.start
> org.apache.ignite.Ignition.start
> org.apache.ignite.startup.cmdline.CommandLineStartup.main
>
> Ignite version 2.7.0
>
>
>
> --
> Sent from: http://apache-ignite-developers.2346864.n4.nabble.com/