Yakov Zhdanov created IGNITE-5125:
------------------------------------- Summary: Need to improve logging in case of hang Key: IGNITE-5125 URL: https://issues.apache.org/jira/browse/IGNITE-5125 Project: Ignite Issue Type: Improvement Reporter: Yakov Zhdanov Assignee: Yakov Zhdanov Priority: Critical Fix For: 2.1 1. When cache operation hangs on node it is not reported as hanged although partition map exchange cannot finish. {noformat} java.lang.Thread.State: WAITING (parking) at sun.misc.Unsafe.park(Native Method) - parking to wait for <0x00007fdd260fd7c0> (a org.apache.ignite.internal.util.future.GridEmbeddedFuture) at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186) at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:834) at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireShared(AbstractQueuedSynchronizer.java:964) at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireShared(AbstractQueuedSynchronizer.java:1282) at org.apache.ignite.internal.util.future.GridFutureAdapter.get0(GridFutureAdapter.java:159) at org.apache.ignite.internal.util.future.GridFutureAdapter.get(GridFutureAdapter.java:119) at org.apache.ignite.internal.processors.cache.GridCacheAdapter$22.op(GridCacheAdapter.java:2356) at org.apache.ignite.internal.processors.cache.GridCacheAdapter$22.op(GridCacheAdapter.java:2354) at org.apache.ignite.internal.processors.cache.GridCacheAdapter.syncOp(GridCacheAdapter.java:4168) at org.apache.ignite.internal.processors.cache.GridCacheAdapter.put0(GridCacheAdapter.java:2354) at org.apache.ignite.internal.processors.cache.GridCacheAdapter.put(GridCacheAdapter.java:2335) at org.apache.ignite.internal.processors.cache.GridCacheAdapter.put(GridCacheAdapter.java:2312) at org.apache.ignite.internal.processors.cache.IgniteCacheProxy.put(IgniteCacheProxy.java:1379) {noformat} {noformat} java.lang.Thread.State: WAITING (parking) at sun.misc.Unsafe.park(Native Method) - parking to wait for <0x00007fa0698fee50> (a org.apache.ignite.internal.processors.cache.distributed.dht.GridPartitionedSingleGetFuture) at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186) at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:834) at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:994) at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1303) at org.apache.ignite.internal.util.future.GridFutureAdapter.get0(GridFutureAdapter.java:161) at org.apache.ignite.internal.util.future.GridFutureAdapter.get(GridFutureAdapter.java:119) at org.apache.ignite.internal.processors.cache.GridCacheAdapter.get0(GridCacheAdapter.java:4570) at org.apache.ignite.internal.processors.cache.GridCacheAdapter.get(GridCacheAdapter.java:4544) at org.apache.ignite.internal.processors.cache.GridCacheAdapter.get(GridCacheAdapter.java:1428) at org.apache.ignite.internal.processors.cache.GridCacheProxyImpl.get(GridCacheProxyImpl.java:329) at org.apache.ignite.internal.processors.datastructures.DataStructuresProcessor.getAtomic(DataStructuresProcessor.java:589) at org.apache.ignite.internal.processors.datastructures.DataStructuresProcessor.sequence(DataStructuresProcessor.java:397) {noformat} 2. Partition exchnage future dumps objects only limited number of times. I would suggest to switch to mode when we double the delay between dumps each time, but no more than 30min 3. If exchange worker is stuck at GridDhtPartitionsExchangeFuture.waitPartitionRelease then unreleased partitions should be reported (same rules as of pt 2 apply) {noformat} "exchange-worker-#93%...%" #143 prio=5 os_prio=0 tid=0x00007fd782df3000 nid=0x1526 waiting on condition [0x00007fc2dc9c5000] java.lang.Thread.State: TIMED_WAITING (parking) at sun.misc.Unsafe.park(Native Method) - parking to wait for <0x00007fcab30006b8> (a org.apache.ignite.internal.util.future.GridCompoundFuture) at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215) at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedNanos(AbstractQueuedSynchronizer.java:1037) at java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1328) at org.apache.ignite.internal.util.future.GridFutureAdapter.get0(GridFutureAdapter.java:189) at org.apache.ignite.internal.util.future.GridFutureAdapter.get(GridFutureAdapter.java:139) at org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.waitPartitionRelease(GridDhtPartitionsExchangeFuture.java:980) at org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.distributedExchange(GridDhtPartitionsExchangeFuture.java:907) at org.apache.ignite.internal.processors.cache.distributed.dht.preloader.GridDhtPartitionsExchangeFuture.init(GridDhtPartitionsExchangeFuture.java:617) at org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$ExchangeWorker.body(GridCachePartitionExchangeManager.java:1701) at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:110) at java.lang.Thread.run(Thread.java:745) {noformat} -- This message was sent by Atlassian JIRA (v6.3.15#6346) |
Free forum by Nabble | Edit this page |