[jira] [Created] (IGNITE-10636) Deadlock on stopping node due to segmentation


Anton Vinogradov (Jira)
Anton Kalashnikov created IGNITE-10636:
------------------------------------------

             Summary: Deadlock on stopping node due to segmentation
                 Key: IGNITE-10636
                 URL: https://issues.apache.org/jira/browse/IGNITE-10636
             Project: Ignite
          Issue Type: Bug
            Reporter: Anton Kalashnikov


* The node has "put" operations in progress.
* The node detects segmentation.
* The node calls the failure handler (StopNodeFailureHandler) to stop itself.
* The failure handler tries to acquire the GridKernalGateway write lock, which waits until all in-flight operations have finished.
* GridNearTxLocal uninterruptibly awaits the rollbackNearTxLocalAsync future, so the put operation never finishes and the write lock is never acquired.
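
The sequence above can be illustrated with a minimal, self-contained sketch. It is not Ignite code: the class GatewayDeadlockSketch, the method stopperAcquiresWriteLock and the variable names are hypothetical, a plain ReentrantReadWriteLock stands in for GridKernalGateway, and a CompletableFuture stands in for the rollback future. It also assumes that the in-flight operation holds the gateway read lock while it waits, which is how the description reads.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class GatewayDeadlockSketch {
    /**
     * Simulates a "put" thread that holds the read lock while waiting on a future,
     * and a "stop" thread (the caller) that tries to take the write lock.
     *
     * @return true if the write lock was acquired within the timeout.
     */
    public static boolean stopperAcquiresWriteLock(long timeoutMs) {
        // Stand-in for GridKernalGateway (assumption: operations hold its read lock).
        ReentrantReadWriteLock gateway = new ReentrantReadWriteLock();
        // Stand-in for the rollback future that nobody completes during the stop.
        CompletableFuture<Void> rollbackFut = new CompletableFuture<>();
        CountDownLatch readLocked = new CountDownLatch(1);

        Thread putThread = new Thread(() -> {
            gateway.readLock().lock();
            try {
                readLocked.countDown();
                // Blocks with no timeout; join() does not throw InterruptedException.
                rollbackFut.join();
            } finally {
                gateway.readLock().unlock();
            }
        }, "put-thread");
        putThread.setDaemon(true);
        putThread.start();

        try {
            // Wait until the put thread really owns the read lock.
            readLocked.await();

            // The "failure handler" side: a timed attempt to take the write lock.
            boolean acquired = gateway.writeLock().tryLock(timeoutMs, TimeUnit.MILLISECONDS);
            if (acquired)
                gateway.writeLock().unlock();

            // Only so that the sketch terminates: complete the future and let the reader exit.
            rollbackFut.complete(null);
            putThread.join();

            return acquired;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("write lock acquired: " + stopperAcquiresWriteLock(200));
    }
}
```

In this sketch the timed tryLock returns false because the reader never releases the read lock while it is parked on the future. In the reported case nothing completes the future, so repeated attempts by the stopping thread would keep failing.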

The failure handler thread is waiting here:
{noformat}
Lock [object=java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync@2370ac7a, ownerName=null, ownerId=-1]
[03:24:53] :     [Step 4/5]         at sun.misc.Unsafe.park(Native Method)
[03:24:53] :     [Step 4/5]         at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
[03:24:53] :     [Step 4/5]         at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireNanos(AbstractQueuedSynchronizer.java:934)
[03:24:53] :     [Step 4/5]         at java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireNanos(AbstractQueuedSynchronizer.java:1247)
[03:24:53] :     [Step 4/5]         at java.util.concurrent.locks.ReentrantReadWriteLock$WriteLock.tryLock(ReentrantReadWriteLock.java:1115)
[03:24:53] :     [Step 4/5]         at o.a.i.i.util.StripedCompositeReadWriteLock$WriteLock.tryLock(StripedCompositeReadWriteLock.java:220)
[03:24:53] :     [Step 4/5]         at o.a.i.i.GridKernalGatewayImpl.tryWriteLock(GridKernalGatewayImpl.java:143)
[03:24:53] :     [Step 4/5]         at o.a.i.i.IgniteKernal.stop0(IgniteKernal.java:2313)
[03:24:53] :     [Step 4/5]         at o.a.i.i.IgniteKernal.stop(IgniteKernal.java:2230)
[03:24:53] :     [Step 4/5]         at o.a.i.i.IgnitionEx$IgniteNamedInstance.stop0(IgnitionEx.java:2613)
[03:24:53] :     [Step 4/5]         - locked o.a.i.i.IgnitionEx$IgniteNamedInstance@41294371
[03:24:53] :     [Step 4/5]         at o.a.i.i.IgnitionEx$IgniteNamedInstance.stop(IgnitionEx.java:2576)
[03:24:53] :     [Step 4/5]         at o.a.i.i.IgnitionEx.stop(IgnitionEx.java:379)
[03:24:53] :     [Step 4/5]         at o.a.i.failure.StopNodeFailureHandler$1.run(StopNodeFailureHandler.java:36)
[03:24:53] :     [Step 4/5]         at java.lang.Thread.run(Thread.java:748)
{noformat}
The put thread is waiting here:
{noformat}
java.lang.Thread.State: WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        at java.util.concurrent.locks.LockSupport.park(LockSupport.java:304)
        at org.apache.ignite.internal.util.future.GridFutureAdapter.get0(GridFutureAdapter.java:178)
        at org.apache.ignite.internal.util.future.GridFutureAdapter.get(GridFutureAdapter.java:141)
        at org.apache.ignite.internal.processors.cache.distributed.near.GridNearTxLocal.close(GridNearTxLocal.java:4358)
        at org.apache.ignite.internal.processors.cache.GridCacheSharedContext.endTx(GridCacheSharedContext.java:1017)
        at org.apache.ignite.internal.processors.cache.transactions.TransactionProxyImpl.close(TransactionProxyImpl.java:329)
        at org.apache.ignite.internal.processors.cache.distributed.GridCacheAbstractNodeRestartSelfTest$3.run(GridCacheAbstractNodeRestartSelfTest.java:782)
        at java.lang.Thread.run(Thread.java:748)
{noformat}


Reproduced by GridCacheAbstractNodeRestartSelfTest#testRestartWithPutTenNodesTwoBackups and by other tests in this class.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)