[jira] [Created] (IGNITE-12496) Index deletion blocks checkpoint for all of its duration, which can cause "Critical system error: system critical thread blocked"

Anton Vinogradov (Jira)
Denis Chudov created IGNITE-12496:
-------------------------------------

             Summary: Index deletion blocks checkpoint for all of its duration, which can cause "Critical system error: system critical thread blocked"
                 Key: IGNITE-12496
                 URL: https://issues.apache.org/jira/browse/IGNITE-12496
             Project: Ignite
          Issue Type: Bug
            Reporter: Denis Chudov
            Assignee: Denis Chudov


GridH2Table#removeIndex(Session, Index) acquires the checkpoint read lock and releases it only after the deletion has fully completed, because H2TreeIndex#destroy must run while the checkpoint lock is held. Meanwhile, the checkpoint thread blocks in Checkpointer#markCheckpointBegin, trying to acquire the write lock, and stays blocked for the entire duration of the index deletion.
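
The contention described above can be reproduced in isolation. The following is a minimal sketch, assuming the checkpoint lock behaves like a plain java.util.concurrent ReentrantReadWriteLock; the class and method names (CheckpointStallDemo, checkpointerGetsWriteLock) are hypothetical and are not Ignite code. One thread plays the index drop and holds the read lock, while the caller plays the checkpointer and tries to take the write lock with a timeout.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Hypothetical model of the stall; not Ignite code. */
public class CheckpointStallDemo {
    /**
     * Returns true if the simulated checkpointer obtained the write lock
     * within timeoutMs while a simulated index drop held the read lock.
     */
    static boolean checkpointerGetsWriteLock(long timeoutMs) {
        ReentrantReadWriteLock cpLock = new ReentrantReadWriteLock();
        CountDownLatch readHeld = new CountDownLatch(1);
        CountDownLatch dropDone = new CountDownLatch(1);

        // Simulated index drop: takes the read lock once and keeps it
        // until the whole "deletion" is finished.
        Thread dropper = new Thread(() -> {
            cpLock.readLock().lock();
            try {
                readHeld.countDown();
                dropDone.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                cpLock.readLock().unlock();
            }
        });
        dropper.start();

        boolean acquired = false;
        try {
            readHeld.await();
            // Simulated checkpointer: needs the write lock to begin a checkpoint.
            acquired = cpLock.writeLock().tryLock(timeoutMs, TimeUnit.MILLISECONDS);
            if (acquired)
                cpLock.writeLock().unlock();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            dropDone.countDown(); // let the simulated drop finish
        }

        try {
            dropper.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return acquired;
    }

    public static void main(String[] args) {
        // The read lock is held for the whole wait, so this reports false.
        System.out.println("write lock acquired: " + checkpointerGetsWriteLock(200));
    }
}
```

In the real node the checkpointer does not use a timeout, so it simply waits until the deletion ends, which is what can trip the "system critical thread blocked" detection.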

A possible fix is to release the checkpoint read lock periodically while index deletion is in progress. To avoid corrupting persistence if the node crashes in the middle of the process, we should put the index root into some persistent structure, such as the index meta tree, and mark it as "pending delete". Then we must delete tree pages from the leaves to the root, which avoids leaving links to deleted pages. When deletion is complete, the tree root can be removed from "pending delete".
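
The proposed flow could look roughly like the sketch below. It is an in-memory illustration under stated assumptions, not the actual Ignite implementation: the names ChunkedIndexDrop, Page, pendingDelete and batchSize are hypothetical, a HashSet stands in for the persistent "pending delete" record, and a plain ReentrantReadWriteLock stands in for the checkpoint lock. Pages are freed children first, each freed child is unlinked from its parent, and the read lock is released and re-acquired after every batchSize freed pages.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Hypothetical sketch of chunked index deletion; not Ignite code. */
public class ChunkedIndexDrop {
    /** Hypothetical tree page holding links to its child pages. */
    static final class Page {
        final int id;
        final List<Page> children = new ArrayList<>();
        Page(int id) { this.id = id; }
    }

    final ReentrantReadWriteLock cpLock = new ReentrantReadWriteLock();
    /** Stand-in for a persistent structure such as the index meta tree. */
    final Set<Integer> pendingDelete = new HashSet<>();
    /** Page ids in the order they were freed. */
    final List<Integer> freed = new ArrayList<>();
    int lockReleases;
    private int freedInBatch;

    void dropIndex(Page root, int batchSize) {
        cpLock.readLock().lock();
        try {
            pendingDelete.add(root.id);    // remember the root before touching pages
            destroy(root, batchSize);
            pendingDelete.remove(root.id); // deletion complete
        } finally {
            cpLock.readLock().unlock();
        }
    }

    /** Frees the subtree in post-order: leaves first, the given page last. */
    private void destroy(Page page, int batchSize) {
        while (!page.children.isEmpty()) {
            int last = page.children.size() - 1;
            destroy(page.children.get(last), batchSize);
            page.children.remove(last);    // unlink the freed child from its parent
            maybeYield(batchSize);
        }
        freed.add(page.id);
        freedInBatch++;
    }

    /** Lets a waiting checkpointer proceed once a batch of pages is freed. */
    private void maybeYield(int batchSize) {
        if (freedInBatch >= batchSize) {
            freedInBatch = 0;
            cpLock.readLock().unlock();
            lockReleases++;
            cpLock.readLock().lock();
        }
    }

    /** Builds a small tree: 1 -> (2 -> (4, 5), 3). */
    static Page sampleTree() {
        Page root = new Page(1), left = new Page(2);
        left.children.add(new Page(4));
        left.children.add(new Page(5));
        root.children.add(left);
        root.children.add(new Page(3));
        return root;
    }

    public static void main(String[] args) {
        ChunkedIndexDrop drop = new ChunkedIndexDrop();
        drop.dropIndex(sampleTree(), 2);
        System.out.println("freed order: " + drop.freed
            + ", lock releases: " + drop.lockReleases);
    }
}
```

Because the lock is only released right after a freed child has been unlinked, no remaining page links to a freed page at any point where a checkpoint could start. Recovery is not modelled here; presumably, on restart, any root still recorded as "pending delete" would have its deletion resumed from that root.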



--
This message was sent by Atlassian Jira
(v8.3.4#803005)