[jira] [Created] (IGNITE-7278) Node failed to recover partition from WAL on unstable topology.

Anton Vinogradov (Jira)
Andrew Mashenkov created IGNITE-7278:
----------------------------------------

             Summary: Node failed to recover partition from WAL on unstable topology.
                 Key: IGNITE-7278
                 URL: https://issues.apache.org/jira/browse/IGNITE-7278
             Project: Ignite
          Issue Type: Bug
          Components: persistence
            Reporter: Andrew Mashenkov
             Fix For: 2.4


The use case is:
- A grid with a partitioned cache with 2 backups (or a replicated cache).
- Node-1 is killed in the middle of a checkpoint and started again.
- Node-1 detects the unfinished checkpoint and tries to recover it.
- Node-2 is killed while the recovery on Node-1 is still in progress.
- Node-1 fails with an AssertionError.
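
For reference, the node setup described above might be configured roughly as in the sketch below. It is not taken from the attached reproducer: the class name, the method name and the cache name "test-cache" are hypothetical, and it assumes the DataStorageConfiguration API available since Ignite 2.3.

```java
import org.apache.ignite.cache.CacheMode;
import org.apache.ignite.configuration.CacheConfiguration;
import org.apache.ignite.configuration.DataRegionConfiguration;
import org.apache.ignite.configuration.DataStorageConfiguration;
import org.apache.ignite.configuration.IgniteConfiguration;

/** Hypothetical helper, not part of the attached reproducer. */
public class ReproducerConfigSketch {
    /** Builds a node configuration with native persistence and a partitioned cache with 2 backups. */
    static IgniteConfiguration nodeConfig(String instanceName) {
        // Enable persistence on the default data region so that the node
        // writes a WAL and performs checkpoints.
        DataRegionConfiguration regionCfg = new DataRegionConfiguration();
        regionCfg.setPersistenceEnabled(true);

        DataStorageConfiguration storageCfg = new DataStorageConfiguration();
        storageCfg.setDefaultDataRegionConfiguration(regionCfg);

        // Partitioned cache with 2 backups; the cache name is an assumption.
        CacheConfiguration<Integer, Integer> cacheCfg = new CacheConfiguration<>("test-cache");
        cacheCfg.setCacheMode(CacheMode.PARTITIONED);
        cacheCfg.setBackups(2);

        IgniteConfiguration cfg = new IgniteConfiguration();
        cfg.setIgniteInstanceName(instanceName);
        cfg.setDataStorageConfiguration(storageCfg);
        cfg.setCacheConfiguration(cacheCfg);

        return cfg;
    }
}
```

For the replicated variant, CacheMode.REPLICATED would be used in place of the backups setting.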

Please find attached the logs, the parsed WAL, and a reproducer.

The issue can be reproduced with IgnitePdsContinuousRestartTest with minor changes:
two nodes have to be flapping, and the nodes have to be killed ungracefully.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)