[jira] [Created] (IGNITE-6797) Handle IO errors in LFS files

classic Classic list List threaded Threaded
1 message Options
Reply | Threaded
Open this post in threaded view
|

[jira] [Created] (IGNITE-6797) Handle IO errors in LFS files

Anton Vinogradov (Jira)
Alexander Belyak created IGNITE-6797:
----------------------------------------

             Summary: Handle IO errors in LFS files
                 Key: IGNITE-6797
                 URL: https://issues.apache.org/jira/browse/IGNITE-6797
             Project: Ignite
          Issue Type: Bug
      Security Level: Public (Viewable by anyone)
    Affects Versions: 2.1
            Reporter: Alexander Belyak
            Priority: Minor


If some thread was interrupted while IO operation with LFS file (for example - read page) then JVM close FileChannel of such file and mark it as closed by interrupt. If next thread try to load any page from closed file it get ClosedChannelException, but PageMemoryImpl first register page in segment FillPageIdTable loadedPages and didn't clear it after IO error, so third thread will find empty page in it and throw Unknown page type: 0 IgniteCheckedException.
To fix it we should try to restore FileChannel after ClosedChannelException (for first time) and stop node if we get any other exception or get some error while reopening by ClosedChannelException in FilePageStore.
Read from closed channel exception:
{noformat}
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at com.google.common.eventbus.EventSubscriber.handleEvent(EventSubscriber.java:74)
        at com.google.common.eventbus.EventBus.dispatch(EventBus.java:322)
        at com.google.common.eventbus.AsyncEventBus.access$001(AsyncEventBus.java:34)
        at com.google.common.eventbus.AsyncEventBus$1.run(AsyncEventBus.java:117)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.ignite.IgniteCheckedException: Runtime failure on lookup row: org.apache.ignite.internal.processors.cache.IgniteCacheOffheapManagerImpl$SearchRow@5678e76a
        at org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.findOne(BPlusTree.java:1070)
        at org.apache.ignite.internal.processors.cache.IgniteCacheOffheapManagerImpl$CacheDataStoreImpl.find(IgniteCacheOffheapManagerImpl.java:1476)
        at org.apache.ignite.internal.processors.cache.persistence.GridCacheOffheapManager$GridCacheDataStore.find(GridCacheOffheapManager.java:1276)
        at org.apache.ignite.internal.processors.cache.IgniteCacheOffheapManagerImpl.read(IgniteCacheOffheapManagerImpl.java:406)
        at org.apache.ignite.internal.processors.cache.GridCacheAdapter.getAllAsync0(GridCacheAdapter.java:1902)
        at org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtCacheAdapter.getDhtAllAsync(GridDhtCacheAdapter.java:780)
        at org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtGetSingleFuture.getAsync(GridDhtGetSingleFuture.java:360)
        at org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtGetSingleFuture.map0(GridDhtGetSingleFuture.java:254)
        at org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtGetSingleFuture.map(GridDhtGetSingleFuture.java:237)
        at org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtGetSingleFuture.init(GridDhtGetSingleFuture.java:161)
        at org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtCacheAdapter.getDhtSingleAsync(GridDhtCacheAdapter.java:878)
        at org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtCacheAdapter.processNearSingleGetRequest(GridDhtCacheAdapter.java:892)
        at org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtTransactionalCacheAdapter$2.apply(GridDhtTransactionalCacheAdapter.java:131)
        at org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtTransactionalCacheAdapter$2.apply(GridDhtTransactionalCacheAdapter.java:129)
        at org.apache.ignite.internal.processors.cache.GridCacheIoManager.processMessage(GridCacheIoManager.java:1060)
        at org.apache.ignite.internal.processors.cache.GridCacheIoManager.onMessage0(GridCacheIoManager.java:579)
        at org.apache.ignite.internal.processors.cache.GridCacheIoManager.handleMessage(GridCacheIoManager.java:378)
        at org.apache.ignite.internal.processors.cache.GridCacheIoManager.handleMessage(GridCacheIoManager.java:304)
        at org.apache.ignite.internal.processors.cache.GridCacheIoManager.access$100(GridCacheIoManager.java:99)
        at org.apache.ignite.internal.processors.cache.GridCacheIoManager$1.onMessage(GridCacheIoManager.java:293)
        at org.apache.ignite.internal.managers.communication.GridIoManager.invokeListener(GridIoManager.java:1562)
        at org.apache.ignite.internal.managers.communication.GridIoManager.processRegularMessage0(GridIoManager.java:1190)
        at org.apache.ignite.internal.managers.communication.GridIoManager.access$4200(GridIoManager.java:126)
        at org.apache.ignite.internal.managers.communication.GridIoManager$9.run(GridIoManager.java:1097)
        at org.apache.ignite.internal.util.StripedExecutor$Stripe.run(StripedExecutor.java:505)
        ... 1 common frames omitted
Caused by: org.apache.ignite.IgniteCheckedException: Read error
        at org.apache.ignite.internal.processors.cache.persistence.file.FilePageStore.read(FilePageStore.java:359)
        at org.apache.ignite.internal.processors.cache.persistence.file.FilePageStoreManager.read(FilePageStoreManager.java:286)
        at org.apache.ignite.internal.processors.cache.persistence.file.FilePageStoreManager.read(FilePageStoreManager.java:271)
        at org.apache.ignite.internal.processors.cache.persistence.pagemem.PageMemoryImpl.acquirePage(PageMemoryImpl.java:613)
        at org.apache.ignite.internal.processors.cache.persistence.pagemem.PageMemoryImpl.acquirePage(PageMemoryImpl.java:531)
        at org.apache.ignite.internal.processors.cache.persistence.DataStructure.acquirePage(DataStructure.java:129)
        at org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.findDown(BPlusTree.java:1124)
        at org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.findDown(BPlusTree.java:1141)
        at org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.findDown(BPlusTree.java:1141)
        at org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.doFind(BPlusTree.java:1101)
        at org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.findOne(BPlusTree.java:1065)
        ... 25 common frames omitted
Caused by: java.nio.channels.ClosedChannelException: null
        at sun.nio.ch.FileChannelImpl.ensureOpen(FileChannelImpl.java:110)
        at sun.nio.ch.FileChannelImpl.read(FileChannelImpl.java:721)
        at org.apache.ignite.internal.processors.cache.persistence.file.RandomAccessFileIO.read(RandomAccessFileIO.java:62)
        at org.apache.ignite.internal.processors.cache.persistence.file.FilePageStore.read(FilePageStore.java:322)
        ... 35 common frames omitted
{noformat}
Read from empty page exception:
{noformat}
        at com.sbt.core.transport.server.TransportServiceExporter.onMessage(TransportServiceExporter.java:111)
        at com.sbt.core.transport.rpc.impl.RpcExecutorImpl.handleMessage(RpcExecutorImpl.java:132)
        at com.sbt.core.transport.rpc.impl.RpcExecutorImpl.handleMessage(RpcExecutorImpl.java:147)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at com.google.common.eventbus.EventSubscriber.handleEvent(EventSubscriber.java:74)
        at com.google.common.eventbus.EventBus.dispatch(EventBus.java:322)
        at com.google.common.eventbus.AsyncEventBus.access$001(AsyncEventBus.java:34)
        at com.google.common.eventbus.AsyncEventBus$1.run(AsyncEventBus.java:117)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.ignite.IgniteCheckedException: Runtime failure on lookup row: org.apache.ignite.internal.processors.cache.IgniteCacheOffheapManagerImpl$SearchRow@86f20c5
        at org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.findOne(BPlusTree.java:1070)
        at org.apache.ignite.internal.processors.cache.IgniteCacheOffheapManagerImpl$CacheDataStoreImpl.find(IgniteCacheOffheapManagerImpl.java:1476)
        at org.apache.ignite.internal.processors.cache.persistence.GridCacheOffheapManager$GridCacheDataStore.find(GridCacheOffheapManager.java:1276)
        at org.apache.ignite.internal.processors.cache.IgniteCacheOffheapManagerImpl.read(IgniteCacheOffheapManagerImpl.java:406)
        at org.apache.ignite.internal.processors.cache.GridCacheAdapter.getAllAsync0(GridCacheAdapter.java:1902)
        at org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtCacheAdapter.getDhtAllAsync(GridDhtCacheAdapter.java:780)
        at org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtGetSingleFuture.getAsync(GridDhtGetSingleFuture.java:360)
        at org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtGetSingleFuture.map0(GridDhtGetSingleFuture.java:254)
        at org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtGetSingleFuture.map(GridDhtGetSingleFuture.java:237)
        at org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtGetSingleFuture.init(GridDhtGetSingleFuture.java:161)
        at org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtCacheAdapter.getDhtSingleAsync(GridDhtCacheAdapter.java:878)
        at org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtCacheAdapter.processNearSingleGetRequest(GridDhtCacheAdapter.java:892)
        at org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtTransactionalCacheAdapter$2.apply(GridDhtTransactionalCacheAdapter.java:131)
        at org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtTransactionalCacheAdapter$2.apply(GridDhtTransactionalCacheAdapter.java:129)
        at org.apache.ignite.internal.processors.cache.GridCacheIoManager.processMessage(GridCacheIoManager.java:1060)
        at org.apache.ignite.internal.processors.cache.GridCacheIoManager.onMessage0(GridCacheIoManager.java:579)
        at org.apache.ignite.internal.processors.cache.GridCacheIoManager.handleMessage(GridCacheIoManager.java:378)
        at org.apache.ignite.internal.processors.cache.GridCacheIoManager.handleMessage(GridCacheIoManager.java:304)
        at org.apache.ignite.internal.processors.cache.GridCacheIoManager.access$100(GridCacheIoManager.java:99)
        at org.apache.ignite.internal.processors.cache.GridCacheIoManager$1.onMessage(GridCacheIoManager.java:293)
        at org.apache.ignite.internal.managers.communication.GridIoManager.invokeListener(GridIoManager.java:1562)
        at org.apache.ignite.internal.managers.communication.GridIoManager.processRegularMessage0(GridIoManager.java:1190)
        at org.apache.ignite.internal.managers.communication.GridIoManager.access$4200(GridIoManager.java:126)
        at org.apache.ignite.internal.managers.communication.GridIoManager$9.run(GridIoManager.java:1097)
        at org.apache.ignite.internal.util.StripedExecutor$Stripe.run(StripedExecutor.java:505)
        ... 1 common frames omitted
Caused by: org.apache.ignite.IgniteCheckedException: Unknown page IO type: 0
        at org.apache.ignite.internal.processors.cache.persistence.tree.io.PageIO.getBPlusIO(PageIO.java:540)
        at org.apache.ignite.internal.processors.cache.persistence.tree.io.PageIO.getPageIO(PageIO.java:457)
        at org.apache.ignite.internal.processors.cache.persistence.tree.io.PageIO.getPageIO(PageIO.java:420)
        at org.apache.ignite.internal.processors.cache.persistence.tree.util.PageHandler.readPage(PageHandler.java:157)
        at org.apache.ignite.internal.processors.cache.persistence.DataStructure.read(DataStructure.java:319)
        at org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.findDown(BPlusTree.java:1132)
        at org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.findDown(BPlusTree.java:1141)
        at org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.findDown(BPlusTree.java:1141)
        at org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.doFind(BPlusTree.java:1101)
        at org.apache.ignite.internal.processors.cache.persistence.tree.BPlusTree.findOne(BPlusTree.java:1065)
        ... 25 common frames omitted
{noformat}




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)