[jira] [Created] (IGNITE-11743) Stopping caches concurrently with node join may lead to crash of the node

Anton Vinogradov (Jira)
Sergey Chugunov created IGNITE-11743:
----------------------------------------

             Summary: Stopping caches concurrently with node join may lead to crash of the node
                 Key: IGNITE-11743
                 URL: https://issues.apache.org/jira/browse/IGNITE-11743
             Project: Ignite
          Issue Type: Bug
    Affects Versions: 2.7
            Reporter: Sergey Chugunov
            Assignee: Sergey Chugunov
         Attachments: IgnitePdsNodeRestartCacheCreateTest.java

When an existing cache is stopped (e.g. via a call to Ignite#destroyCache(String name)), this action is distributed across the cluster by the discovery mechanism (and is processed from the *disco-notifier-worker* thread).
At the same time, a joining node prepares to start caches from the *exchange-thread*.

If a cache stop request arrives at the new node right in the middle of cache start preparation, it may lead to an exception in FilePageStoreManager like the one below, followed by a node crash.

A test reproducing the issue is attached.

{noformat}
class org.apache.ignite.IgniteCheckedException: Failed to get page store for the given cache ID (cache has not been started): -1422502786
        at org.apache.ignite.internal.processors.cache.persistence.file.FilePageStoreManager.getStore(FilePageStoreManager.java:1132)
        at org.apache.ignite.internal.processors.cache.persistence.file.FilePageStoreManager.read(FilePageStoreManager.java:482)
        at org.apache.ignite.internal.processors.cache.persistence.file.FilePageStoreManager.read(FilePageStoreManager.java:469)
        at org.apache.ignite.internal.processors.cache.persistence.pagemem.PageMemoryImpl.acquirePage(PageMemoryImpl.java:854)
        at org.apache.ignite.internal.processors.cache.persistence.pagemem.PageMemoryImpl.acquirePage(PageMemoryImpl.java:681)
        at org.apache.ignite.internal.processors.cache.persistence.GridCacheOffheapManager.getOrAllocateCacheMetas(GridCacheOffheapManager.java:869)
        at org.apache.ignite.internal.processors.cache.persistence.GridCacheOffheapManager.initDataStructures(GridCacheOffheapManager.java:128)
        at org.apache.ignite.internal.processors.cache.IgniteCacheOffheapManagerImpl.start(IgniteCacheOffheapManagerImpl.java:193)
        at org.apache.ignite.internal.processors.cache.CacheGroupContext.start(CacheGroupContext.java:1043)
        at org.apache.ignite.internal.processors.cache.GridCacheProcessor.startCacheGroup(GridCacheProcessor.java:2829)
        at org.apache.ignite.internal.processors.cache.GridCacheProcessor.getOrCreateCacheGroupContext(GridCacheProcessor.java:2557)
        at org.apache.ignite.internal.processors.cache.GridCacheProcessor.prepareCacheContext(GridCacheProcessor.java:2387)
        at org.apache.ignite.internal.processors.cache.GridCacheProcessor.lambda$null$6a5b31b9$1(GridCacheProcessor.java:2209)
        at org.apache.ignite.internal.processors.cache.GridCacheProcessor.lambda$prepareStartCaches$5(GridCacheProcessor.java:2130)
        at org.apache.ignite.internal.processors.cache.GridCacheProcessor.lambda$prepareStartCaches$926b6886$1(GridCacheProcessor.java:2206)
        at org.apache.ignite.internal.util.IgniteUtils.lambda$null$1(IgniteUtils.java:10874)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
{noformat}
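
The interleaving described above can be illustrated with a small JDK-only simulation. This is only a sketch under assumptions, not Ignite code: the class CacheStopRaceSketch, the pageStores map and the readPage helper are hypothetical stand-ins, and the sketch assumes that the stop request removes a page store which the start path later tries to read. The latches force the unlucky ordering that in a real cluster would happen only occasionally.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;

/** Hypothetical simulation of the race; none of these names exist in Ignite. */
final class CacheStopRaceSketch {
    /** Stand-in for the per-cache page store registry (assumption). */
    static final Map<Integer, String> pageStores = new ConcurrentHashMap<>();

    /** Stand-in for a page read that needs the store of the given cache. */
    static String readPage(int cacheId) {
        String store = pageStores.get(cacheId);
        if (store == null)
            throw new IllegalStateException("Failed to get page store for the given cache ID "
                + "(cache has not been started): " + cacheId);
        return store;
    }

    /** Runs both threads in the problematic order and returns what the start path observed. */
    static String reproduce() {
        final int cacheId = -1422502786; // cache ID taken from the stack trace above
        final CountDownLatch storeRegistered = new CountDownLatch(1);
        final CountDownLatch storeRemoved = new CountDownLatch(1);
        final String[] result = new String[1];

        // Joining node: prepares the cache start in two steps.
        Thread exchange = new Thread(() -> {
            try {
                pageStores.put(cacheId, "page-store"); // step 1: register the store
                storeRegistered.countDown();
                storeRemoved.await();                  // the stop request sneaks in here
                result[0] = readPage(cacheId);         // step 2: read meta pages
            } catch (Exception e) {
                result[0] = e.getMessage();
            }
        }, "exchange-thread");

        // Discovery notification: processes the cache stop request.
        Thread disco = new Thread(() -> {
            try {
                storeRegistered.await();
                pageStores.remove(cacheId);
                storeRemoved.countDown();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "disco-notifier-worker");

        exchange.start();
        disco.start();
        try {
            exchange.join();
            disco.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return result[0];
    }

    public static void main(String[] args) {
        System.out.println(reproduce());
    }
}
```

Running main prints the failure message observed by the simulated start path. In the sketch the error is only captured as a string, whereas in the reported scenario the exception leads to a node crash.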



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)