Semen Boikov created IGNITE-6174:
------------------------------------
Summary: Exchange may not wait for tx completion in case of node failures
Key: IGNITE-6174
URL: https://issues.apache.org/jira/browse/IGNITE-6174
Project: Ignite
Issue Type: Bug
Reporter: Semen Boikov
Assignee: Semen Boikov
Fix For: 2.2
Reproduces reliably in IgniteCachePartitionedNearDisabledPrimaryNodeFailureRecoveryTest.testOptimisticPrimaryAndOriginatingNodeFailureRecovery1.
Approximate scenario:
- there are several nodes in the topology
- node A starts tx prepare
- one node fails and an exchange is started
- all servers except node A finish 'waitPartitionRelease()', and the coordinator waits for node A
- node A also fails, but it managed to send its prepare request, which will still be processed
- since node A has failed, the other nodes can finish the exchange without waiting for tx completion
Note: to increase the probability of reproducing the issue, change the node start order in IgniteCacheAbstractTest:
{noformat}
protected void startGrids() throws Exception {
    int cnt = gridCount();

    assert cnt >= 1 : "At least one grid must be started";

    // Start nodes one by one instead of in parallel; grid 3 is started before grid 2.
    //startGridsMultiThreaded(cnt);
    startGrid(0);
    startGrid(1);
    startGrid(3);
    startGrid(2);

    awaitPartitionMapExchange();
}
{noformat}
One possible solution: the exchange should wait until all messages from the failed node have been processed.
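
A minimal sketch of that idea, for illustration only. PendingMessageTracker and its methods are hypothetical names, not existing Ignite classes or APIs. The sketch assumes that every message received from a node is counted on arrival and un-counted once its handler finishes, and that no new messages from a node arrive after its failure has been detected. The exchange could then wait on the returned future in addition to 'waitPartitionRelease()'.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.CompletableFuture;

/** Hypothetical helper (not an Ignite class): tracks messages received from a node but not yet processed. */
final class PendingMessageTracker {
    /** Number of received-but-unprocessed messages per sender node. */
    private final Map<UUID, Integer> inFlight = new HashMap<>();

    /** Futures handed out to callers waiting for a node's messages to drain. */
    private final Map<UUID, CompletableFuture<Void>> waiters = new HashMap<>();

    /** Called when a message from the given node is received, before it is handled. */
    public synchronized void onMessageReceived(UUID nodeId) {
        inFlight.merge(nodeId, 1, Integer::sum);
    }

    /** Called after the handler for a message from the given node has finished. */
    public void onMessageProcessed(UUID nodeId) {
        CompletableFuture<Void> toComplete = null;

        synchronized (this) {
            Integer cnt = inFlight.get(nodeId);

            if (cnt == null)
                return; // Nothing was tracked for this node.

            if (cnt == 1) {
                inFlight.remove(nodeId);
                toComplete = waiters.remove(nodeId);
            }
            else
                inFlight.put(nodeId, cnt - 1);
        }

        // Complete outside the lock so that callbacks do not run while holding it.
        if (toComplete != null)
            toComplete.complete(null);
    }

    /** Returns a future that completes once no messages from the failed node remain in flight. */
    public synchronized CompletableFuture<Void> awaitDrained(UUID failedNodeId) {
        if (!inFlight.containsKey(failedNodeId))
            return CompletableFuture.completedFuture(null);

        return waiters.computeIfAbsent(failedNodeId, id -> new CompletableFuture<>());
    }
}
```

In the scenario above, the prepare request from node A would be counted on receipt, so the future returned by awaitDrained(nodeA) would stay incomplete until the request has been handled, and the exchange would not finish before the tx is registered.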