[jira] [Created] (IGNITE-9679) Document critical workers liveness checking implementation


Anton Vinogradov (Jira)
Andrey Kuznetsov created IGNITE-9679:
----------------------------------------

             Summary: Document critical workers liveness checking implementation
                 Key: IGNITE-9679
                 URL: https://issues.apache.org/jira/browse/IGNITE-9679
             Project: Ignite
          Issue Type: Task
          Components: documentation
            Reporter: Andrey Kuznetsov
            Assignee: Denis Magda
             Fix For: 2.7


The newly implemented liveness checks for critical worker threads should be covered in the Ignite documentation. A brief description of the functionality follows.

An Ignite node has a number of critical worker threads that must be alive and responsive; otherwise the node's health is not guaranteed. These threads periodically monitor each other, tracking two aspects of the thread being checked:
- whether it is alive;
- whether it updates its internal heartbeat timestamp.
Both checks use the {{IgniteConfiguration.failureDetectionTimeout}} property as a threshold value.
Whenever at least one of these conditions is violated, the checker thread logs an error and calls the currently configured {{FailureHandler}}.
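
The two checks described above can be illustrated with a minimal, self-contained sketch. This is not Ignite's implementation: {{MonitoredWorker}}, {{LivenessChecker}} and the {{Consumer<String>}} callback are hypothetical stand-ins for a critical worker, the checker thread's logic and the configured {{FailureHandler}}, and {{timeoutMs}} plays the role of {{failureDetectionTimeout}}. How the threshold applies to the aliveness check is not specified here, so the sketch applies it only to heartbeat staleness.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical stand-in for a critical worker; not an Ignite class. */
class MonitoredWorker {
    final String name;
    private final Thread thread;
    private volatile long heartbeatTs;

    MonitoredWorker(String name, Thread thread, long nowMs) {
        this.name = name;
        this.thread = thread;
        this.heartbeatTs = nowMs;
    }

    /** Called by the worker itself to show that it is making progress. */
    void updateHeartbeat(long nowMs) {
        heartbeatTs = nowMs;
    }

    boolean isAlive() {
        return thread.isAlive();
    }

    long heartbeatTs() {
        return heartbeatTs;
    }
}

/** Hypothetical checker: one pass over the workers, as a checker thread might do periodically. */
class LivenessChecker {
    private final long timeoutMs;                  // assumed analogue of failureDetectionTimeout
    private final Consumer<String> failureHandler; // assumed analogue of the configured FailureHandler

    LivenessChecker(long timeoutMs, Consumer<String> failureHandler) {
        this.timeoutMs = timeoutMs;
        this.failureHandler = failureHandler;
    }

    /** Returns the names of workers that failed either check at time nowMs. */
    List<String> check(Iterable<MonitoredWorker> workers, long nowMs) {
        List<String> failed = new ArrayList<>();

        for (MonitoredWorker w : workers) {
            String reason = null;

            if (!w.isAlive())
                reason = "terminated";                       // check 1: thread is not alive
            else if (nowMs - w.heartbeatTs() > timeoutMs)
                reason = "blocked";                          // check 2: heartbeat is stale

            if (reason != null) {
                String msg = "Critical worker " + reason + ": " + w.name;
                System.err.println(msg);                     // log the error
                failureHandler.accept(msg);                  // then invoke the handler
                failed.add(w.name);
            }
        }

        return failed;
    }
}
```

Passing the current time in as a parameter keeps the sketch deterministic; a real checker would read a clock and run the pass on a schedule.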

Liveness checks are enabled by default but can be disabled through the {{WorkersControlMXBean.healthMonitoringEnabled}} property.




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)