Hi Team,
We have a 3-node server cluster. A 4th node joins as a client with a continuous query on Table A (Transaction_mode = transactional).

If I bring the client down and issue an update to Table A within the failureDetectionTimeout (30000), I get the following error, and this error brings the server down:

"(err) Failed to notify listener: GridDhtTxPrepareFuture Error"

Basically, the server tries to update the record in Table A and then tries to notify the client, since the client had registered a continuous query for Table A. But since the client node has been brought down, the remote filter factory lambda gets undeployed, so the server is no longer able to complete the transaction. This also brings the server down.

How can I resolve this issue?

Please find the complete stack trace for this error:

[12:14:12] (err) Failed to notify listener: GridDhtTxPrepareFuture [futId=0a69e79c071-93faf34d-a776-4166-9f3b-4b5a0f54b8f9, err=null, replied=1, mapped=1, req=GridNearTxPrepareRequest [futId=4250e79c071-51438f4f-c061-45f7-b34e-57c90f2055e9, miniId=1, topVer=AffinityTopologyVersion [topVer=7, minorTopVer=0], subjId=da486d0b-36a1-43d4-b05b-47d126fd880e, taskNameHash=0, flags=[implicitSingle], super=GridDistributedTxPrepareRequest [threadId=382, concurrency=OPTIMISTIC, isolation=READ_COMMITTED, writeVer=GridCacheVersion [topVer=195408427, order=1583928843624, nodeOrder=1], timeout=1000, reads=null, writes=[IgniteTxEntry [key=ABCKEY [idHash=1413504800, hash=-1419375634, VALUETYPE=somevaluetype, NAME=TEST4375234], cacheId=-1512899836, txKey=IgniteTxKey [key=ABCKEY [idHash=1413504800, hash=-1419375634, VALUETYPE=somevaluetype, NAME=TEST4375234], cacheId=-1512899836], val=[op=CREATE, val=ABC [idHash=108633195, hash=-965148880, ACTIVE=true, MODIFICATIONDATE=2020-02-03 18:29:03.501, VALUETYPE=null, SCHEMAREF=null, VALUE=DEV, MACHINENAME=null, COMMENT=null, NAME=null, APPLICATIONNAME=null, SCHEMANAME=null,
KEYNAME=ENVIRONMENT, USERNAME=null, INTERNALVERSION=null, MODIFICATIONTYPE=null]], prevVal=[op=NOOP, val=null], oldVal=[op=NOOP, val=null], entryProcessorsCol=null, ttl=-1, conflictExpireTime=-1, conflictVer=null, explicitVer=null, dhtVer=null, filters=[o.a.i.i.processors.cache.CacheEntrySerializablePredicate@388c822f], filtersPassed=false, filtersSet=false, entry=GridDhtCacheEntry [rdrs=[], part=136, super=GridDistributedCacheEntry [super=GridCacheMapEntry [key=ABCKEY [idHash=1413504800, hash=-1419375634, VALUETYPE=somevaluetype, NAME=TEST4375234], val=null, ver=GridCacheVersion [topVer=195408427, order=1583928843625, nodeOrder=4], hash=-1419375634, extras=GridCacheObsoleteEntryExtras [obsoleteVer=GridCacheVersion [topVer=2147483647, order=0, nodeOrder=0]], flags=2]]], prepared=1, locked=false, nodeId=null, locMapped=false, expiryPlc=null, transferExpiryPlc=false, flags=2, partUpdateCntr=0, serReadVer=null, xidVer=null]], dhtVers=null, txSize=0, plc=2, txState=IgniteTxImplicitSingleStateImpl [init=true, recovery=false], flags=onePhase|last, super=GridDistributedBaseMessage [ver=GridCacheVersion [topVer=195408427, order=1583928843624, nodeOrder=1], committedVers=null, rolledbackVers=null, cnt=0, super=GridCacheIdMessage [cacheId=0]]]], trackable=true, nearMiniId=1, last=true, retVal=false, ret=GridCacheReturn [v=null, cacheObj=null, success=true, invokeRes=false, loc=false, cacheId=0], lockKeys=[], forceKeysFut=null, locksReady=true, invoke=false, timeoutObj=PrepareTimeoutObject [timeout=1000], xid=GridCacheVersion [topVer=195408427, order=1583928843625, nodeOrder=4], innerFuts=[[node=da486d0b-36a1-43d4-b05b-47d126fd880e, loc=false, done=true]], super=GridCompoundFuture [rdc=o.a.i.i.processors.cache.distributed.dht.GridDhtTxPrepareFuture$1@73415bf, initFlag=1, lsnrCalls=1, done=true, cancelled=false, err=null, futs=[true]]]java.lang.NoClassDefFoundError: com/companyname/abc/configstore/helper/ContinuousQueryHelper at 
com.companyname.abc.configstore.helper.ContinuousQueryHelper$ConfigStoreTableRemoteFilterFactory$1.evaluate(ContinuousQueryHelper.java:293) at org.apache.ignite.internal.processors.cache.query.continuous.CacheContinuousQueryHandler.filter(CacheContinuousQueryHandler.java:833) at org.apache.ignite.internal.processors.cache.query.continuous.CacheContinuousQueryHandler$2.onEntryUpdated(CacheContinuousQueryHandler.java:422) at org.apache.ignite.internal.processors.cache.query.continuous.CacheContinuousQueryManager.onEntryUpdated(CacheContinuousQueryManager.java:426) at org.apache.ignite.internal.processors.cache.GridCacheMapEntry.innerSet(GridCacheMapEntry.java:1584) at org.apache.ignite.internal.processors.cache.transactions.IgniteTxLocalAdapter.userCommit(IgniteTxLocalAdapter.java:741) at org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtTxLocalAdapter.localFinish(GridDhtTxLocalAdapter.java:796) at org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtTxLocal.localFinish(GridDhtTxLocal.java:584) at org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtTxLocal.finishTx(GridDhtTxLocal.java:463) at org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtTxLocal.commitDhtLocalAsync(GridDhtTxLocal.java:516) at org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtTxLocal.commitAsync(GridDhtTxLocal.java:525) at org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtTxPrepareFuture.onDone(GridDhtTxPrepareFuture.java:758) at org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtTxPrepareFuture.onDone(GridDhtTxPrepareFuture.java:110) at org.apache.ignite.internal.util.future.GridFutureAdapter.onDone(GridFutureAdapter.java:453) at org.apache.ignite.internal.util.future.GridCompoundFuture.checkComplete(GridCompoundFuture.java:285) at org.apache.ignite.internal.util.future.GridCompoundFuture.apply(GridCompoundFuture.java:144) at 
org.apache.ignite.internal.util.future.GridCompoundFuture.apply(GridCompoundFuture.java:45) at org.apache.ignite.internal.util.future.GridFutureAdapter.notifyListener(GridFutureAdapter.java:385) at org.apache.ignite.internal.util.future.GridFutureAdapter.unblock(GridFutureAdapter.java:349) at org.apache.ignite.internal.util.future.GridFutureAdapter.unblockAll(GridFutureAdapter.java:337) at org.apache.ignite.internal.util.future.GridFutureAdapter.onDone(GridFutureAdapter.java:497) at org.apache.ignite.internal.util.future.GridFutureAdapter.onDone(GridFutureAdapter.java:476) at org.apache.ignite.internal.util.future.GridFutureAdapter.onDone(GridFutureAdapter.java:453) at org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtTxPrepareFuture$MiniFuture.onResult(GridDhtTxPrepareFuture.java:1948) at org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtTxPrepareFuture.onResult(GridDhtTxPrepareFuture.java:572) at org.apache.ignite.internal.processors.cache.transactions.IgniteTxHandler.processDhtTxPrepareResponse(IgniteTxHandler.java:798) at org.apache.ignite.internal.processors.cache.transactions.IgniteTxHandler.access$500(IgniteTxHandler.java:119) at org.apache.ignite.internal.processors.cache.transactions.IgniteTxHandler$6.apply(IgniteTxHandler.java:229) at org.apache.ignite.internal.processors.cache.transactions.IgniteTxHandler$6.apply(IgniteTxHandler.java:227) at org.apache.ignite.internal.processors.cache.GridCacheIoManager.processMessage(GridCacheIoManager.java:1056) at org.apache.ignite.internal.processors.cache.GridCacheIoManager.onMessage0(GridCacheIoManager.java:581) at org.apache.ignite.internal.processors.cache.GridCacheIoManager.handleMessage(GridCacheIoManager.java:380) at org.apache.ignite.internal.processors.cache.GridCacheIoManager.handleMessage(GridCacheIoManager.java:306) at org.apache.ignite.internal.processors.cache.GridCacheIoManager.access$100(GridCacheIoManager.java:101) at 
org.apache.ignite.internal.processors.cache.GridCacheIoManager$1.onMessage(GridCacheIoManager.java:295) at org.apache.ignite.internal.managers.communication.GridIoManager.invokeListener(GridIoManager.java:1569) at org.apache.ignite.internal.managers.communication.GridIoManager.processRegularMessage0(GridIoManager.java:1197) at org.apache.ignite.internal.managers.communication.GridIoManager.access$4200(GridIoManager.java:127) at org.apache.ignite.internal.managers.communication.GridIoManager$9.run(GridIoManager.java:1093) at org.apache.ignite.internal.util.StripedExecutor$Stripe.body(StripedExecutor.java:505) at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:120) at java.lang.Thread.run(Thread.java:748) Caused by: java.lang.ClassNotFoundException: Failed to peer load class [class=com.companyname.abc.configstore.helper.ContinuousQueryHelper, nodeClsLdrs={fb2b9513-a763-488a-86b8-39d80e18427f=35f0489c071-fb2b9513-a763-488a-86b8-39d80e18427f}, parentClsLoader=sun.misc.Launcher$AppClassLoader@73d16e93] at org.apache.ignite.internal.managers.deployment.GridDeploymentClassLoader.sendClassRequest(GridDeploymentClassLoader.java:661) at org.apache.ignite.internal.managers.deployment.GridDeploymentClassLoader.findClass(GridDeploymentClassLoader.java:508) at java.lang.ClassLoader.loadClass(ClassLoader.java:418) at org.apache.ignite.internal.managers.deployment.GridDeploymentClassLoader.loadClass(GridDeploymentClassLoader.java:440) ... 
42 more Caused by: class org.apache.ignite.IgniteCheckedException: Failed to send message (node may have left the grid or TCP connection cannot be established due to firewall issues) [node=TcpDiscoveryNode [id=fb2b9513-a763-488a-86b8-39d80e18427f, addrs=[0:0:0:0:0:0:0:1, x.x.x.100, 127.0.0.1], sockAddrs=[machinename.companyname.LOCAL/x.x.x.100:0, /0:0:0:0:0:0:0:1:0, /127.0.0.1:0], discPort=0, order=7, intOrder=5, lastExchangeTime=1583928842125, loc=false, ver=2.7.6#20190911-sha1:21f7ca41, isClient=true], topic=TOPIC_CLASSLOAD, msg=GridDeploymentRequest [rsrcName=com/companyname/abc/configstore/helper/ContinuousQueryHelper.class, ldrId=35f0489c071-fb2b9513-a763-488a-86b8-39d80e18427f, isUndeploy=false, nodeIds=null], policy=1] at org.apache.ignite.internal.managers.communication.GridIoManager.send(GridIoManager.java:1667) at org.apache.ignite.internal.managers.communication.GridIoManager.sendToGridTopic(GridIoManager.java:1731) at org.apache.ignite.internal.managers.deployment.GridDeploymentCommunication.sendResourceRequest(GridDeploymentCommunication.java:454) at org.apache.ignite.internal.managers.deployment.GridDeploymentClassLoader.sendClassRequest(GridDeploymentClassLoader.java:601) ... 45 more Caused by: class org.apache.ignite.spi.IgniteSpiException: Failed to send message to remote node: TcpDiscoveryNode [id=fb2b9513-a763-488a-86b8-39d80e18427f, addrs=[0:0:0:0:0:0:0:1, x.x.x.100, 127.0.0.1], sockAddrs=[machinename.companyname.LOCAL/x.x.x.100:0, /0:0:0:0:0:0:0:1:0, /127.0.0.1:0], discPort=0, order=7, intOrder=5, lastExchangeTime=1583928842125, loc=false, ver=2.7.6#20190911-sha1:21f7ca41, isClient=true] at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.sendMessage0(TcpCommunicationSpi.java:2747) at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.sendMessage(TcpCommunicationSpi.java:2672) at org.apache.ignite.internal.managers.communication.GridIoManager.send(GridIoManager.java:1656) ... 
48 more Caused by: class org.apache.ignite.IgniteCheckedException: Failed to connect to node (is node still alive?). Make sure that each ComputeTask and cache Transaction has a timeout set in order to prevent parties from waiting forever in case of network issues [nodeId=fb2b9513-a763-488a-86b8-39d80e18427f, addrs=[/127.0.0.1:47102, /0:0:0:0:0:0:0:1:47102, machinename.companyname.LOCAL/x.x.x.100:47102]] at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.createTcpClient(TcpCommunicationSpi.java:3459) at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.createNioClient(TcpCommunicationSpi.java:2987) at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.reserveClient(TcpCommunicationSpi.java:2870) at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.access$6000(TcpCommunicationSpi.java:271) at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi$CommunicationWorker.processDisconnect(TcpCommunicationSpi.java:4489) at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi$CommunicationWorker.body(TcpCommunicationSpi.java:4294) at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:120) at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi$5.body(TcpCommunicationSpi.java:2237) at org.apache.ignite.spi.IgniteSpiThread.run(IgniteSpiThread.java:62) Suppressed: class org.apache.ignite.IgniteCheckedException: Failed to connect to address [addr=/127.0.0.1:47102, err=Connection refused: no further information] at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.createTcpClient(TcpCommunicationSpi.java:3462) ... 
8 more Caused by: java.net.ConnectException: Connection refused: no further information at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717) at sun.nio.ch.SocketAdaptor.connect(SocketAdaptor.java:111) at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.createTcpClient(TcpCommunicationSpi.java:3299) ... 8 more Suppressed: class org.apache.ignite.IgniteCheckedException: Failed to connect to address [addr=/0:0:0:0:0:0:0:1:47102, err=Connection refused: no further information] at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.createTcpClient(TcpCommunicationSpi.java:3462) ... 8 more Caused by: java.net.ConnectException: Connection refused: no further information at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717) at sun.nio.ch.SocketAdaptor.connect(SocketAdaptor.java:111) at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.createTcpClient(TcpCommunicationSpi.java:3299) ... 8 more Suppressed: class org.apache.ignite.IgniteCheckedException: Failed to connect to address [addr=machinename.companyname.LOCAL/x.x.x.100:47102, err=Connection refused: no further information] at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.createTcpClient(TcpCommunicationSpi.java:3462) ... 8 more Caused by: java.net.ConnectException: Connection refused: no further information at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717) at sun.nio.ch.SocketAdaptor.connect(SocketAdaptor.java:111) at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.createTcpClient(TcpCommunicationSpi.java:3299) -- Sent from: http://apache-ignite-developers.2346864.n4.nabble.com/ |
Hi,

Did anyone get a chance to look at this?

Summary of the issue I am facing:

We have a 3-node server cluster. A 4th node joins as a client with a continuous query on Table A (Transaction_mode = transactional).

If I bring the client down and issue an update to Table A within the failureDetectionTimeout (30000), I get the following error, and this error brings the server down:

"(err) Failed to notify listener: GridDhtTxPrepareFuture Error"

Basically, the server tries to update the record in Table A and then tries to notify the client, since the client had registered a continuous query for Table A. But since the client node has been brought down, the remote filter factory lambda gets undeployed, so the server is no longer able to complete the transaction. This also brings the server down.

Regards,
Veena.
Hello!
But why did the node go down? Was it due to the failure handler? An unhandled exception in a critical thread? Anything else? Did you find any workaround?

I have not heard about continuous query issues such as the one you are describing, and I suggest filing an IGNITE ticket with details.

Regards,
--
Ilya Kasnacheev

Thu, 12 Mar 2020 at 16:39, VeenaMithare <[hidden email]>:
> Hi,
>
> Did anyone get a chance to look at this?
I raised this Jira ticket:
https://issues.apache.org/jira/browse/IGNITE-12784

Observed in 2.7.6. I was unable to easily test this in 2.8.0 because of other issues, one of them being:
http://apache-ignite-users.70518.x6.nabble.com/2-8-0-JDBC-Thin-Client-Unable-to-load-the-tables-via-DBeaver-td31681.html
Thanks for the report!
The issue here is that a remote filter for a continuous query is loaded using peer class loading, and other classes that this remote filter depends on can be loaded lazily while it runs. Loading each dependency class involves going to the node that the originating class was loaded from and asking that node to send the missing classes over the network.

The problems begin when this node is no longer in the cluster but the continuous query has not been undeployed yet. A server sends a class request to a node that is unavailable but has not been removed from the topology yet, because the failure detection timeout has not elapsed. This leads to the NoClassDefFoundError that you observe in the logs.

The biggest issue here is that this exception triggers a failure handler that makes the whole node go down. I would expect only that one request to fail, not the whole node.

As a temporary solution, you can stop relying on peer class loading for continuous queries and put the code of the remote filters on the classpath of the server nodes. This way no lazy class loading is performed over the network, since all the classes are available locally.

Denis

Fri, 13 Mar 2020 at 20:39, VeenaMithare <[hidden email]>:
> Raised this jira :
> https://issues.apache.org/jira/browse/IGNITE-12784
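If you go the server-classpath route, a quick startup check on each server node can confirm that the filter classes really are available locally. Below is a minimal sketch of such a check in plain Java. `ServerClasspathCheck` and `missingClasses` are hypothetical names, not part of the Ignite API, and the class names in `main` are simply the ones that appear in the stack trace. It assumes the standard parent-first delegation, where the application class loader (the `parentClsLoader=sun.misc.Launcher$AppClassLoader` in the trace) is consulted before any class request is sent to the originating node.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical startup check (not part of Ignite): are the filter classes on the local classpath? */
final class ServerClasspathCheck {

    /** Returns the class names that the application class loader cannot resolve locally. */
    static List<String> missingClasses(List<String> classNames) {
        ClassLoader appLoader = ClassLoader.getSystemClassLoader();
        List<String> missing = new ArrayList<>();
        for (String name : classNames) {
            try {
                // initialize=false: only locate the class, do not run its static initializers.
                Class.forName(name, false, appLoader);
            } catch (ClassNotFoundException | LinkageError e) {
                missing.add(name);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Names taken from the stack trace in this thread; replace them with your own filter classes.
        List<String> required = Arrays.asList(
            "com.companyname.abc.configstore.helper.ContinuousQueryHelper",
            "com.companyname.abc.configstore.helper.ContinuousQueryHelper$ConfigStoreTableRemoteFilterFactory");

        List<String> missing = missingClasses(required);
        if (!missing.isEmpty()) {
            System.err.println("Not on the local classpath: " + missing);
        }
    }
}
```

Running this with the same classpath as the server node, before the node starts, should report any filter or helper class that would otherwise have to be fetched from the client at runtime.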
> As a temporary solution you can stop relying on peer class loading for
> continuous queries and provide the code of remote filters to the classpath
> of server nodes.

Yes, I was thinking of a solution along similar lines. Thank you for the reply.