Ivan Artukhov created IGNITE-8497:
-------------------------------------

             Summary: Ignite stops the node in the middle of checkpointing upon receiving a SIGINT
                 Key: IGNITE-8497
                 URL: https://issues.apache.org/jira/browse/IGNITE-8497
             Project: Ignite
          Issue Type: Bug
          Components: persistence
    Affects Versions: 2.4
         Environment: Ubuntu 17.10
            Reporter: Ivan Artukhov
         Attachments: example-cache.xml, srv.1.log, srv.2.log

*Steps*
Start Ignite server node with enabled PDS (see the attached [^example-cache.xml] config file)
Activate the cluster with _./bin/control.sh --activate_
Put some data into cluster (with _CachePutGetExample.java_ for example)
Stop Ignite server node with SIGINT

*Actual result*
Ignite server node invokes the shutdown hook, checkpoint procedure starts, but Ignite node *does not wait for checkpoint to finish* and terminates the node.

An excerpt from [^srv.1.log] :
{noformat}
[2018-05-15 15:20:59,976][INFO ][Thread-3][G] Invoking shutdown hook...
[2018-05-15 15:20:59,979][INFO ][Thread-3][GridTcpRestProtocol] Command protocol successfully stopped: TCP binary
[2018-05-15 15:20:59,998][INFO ][db-checkpoint-thread-#50][GridCacheDatabaseSharedManager] Checkpoint started [checkpointId=f0dde95a-6027-40dd-b3f3-4311aa8508c3, startPtr=FileWALPointer [idx=0, fileOff=460751, len=40871], checkpointLockWait=0ms, checkpointLockHoldTime=6ms, pages=167, reason='timeout']
[2018-05-15 15:21:00,011][INFO ][Thread-3][GridCacheProcessor] Stopped cache [cacheName=default]
[2018-05-15 15:21:00,011][INFO ][Thread-3][GridCacheProcessor] Stopped cache [cacheName=ignite-sys-cache]
[2018-05-15 15:21:00,012][INFO ][Thread-3][GridCacheProcessor] Stopped cache [cacheName=CachePutGetExample]
[2018-05-15 15:21:00,049][INFO ][Thread-3][IgniteKernal]

>>> +-----------------------------------------------------+
>>> Ignite ver. 2.4.0-SNAPSHOT#19700101-sha1:DEV stopped OK
>>> +-----------------------------------------------------+
>>> Grid uptime: 00:00:36.228
{noformat}

When one starts the node again, the following warning appears in the log ( [^srv.2.log] ):
{noformat}
[2018-05-15 15:21:39,848][WARN ][main][GridCacheDatabaseSharedManager] Ignite node stopped in the middle of checkpoint. Will restore memory state and finish checkpoint on node start.
{noformat}

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
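To put numbers on the srv.1.log excerpt, the following small sketch computes the gaps between the quoted timestamps. It uses only the JDK; the class and method names are made up for illustration and are not part of Ignite. With the timestamps from the excerpt, the checkpoint starts 22 ms after the shutdown hook is invoked, and the node reports "stopped OK" 51 ms after the checkpoint starts.

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

// Illustrative helper (hypothetical name), not Ignite code.
class LogGapSketch {
    // Timestamp layout used in the log excerpt: 2018-05-15 15:20:59,976
    private static final DateTimeFormatter LOG_TS =
        DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss,SSS");

    /** Milliseconds elapsed between two log timestamps. */
    static long millisBetween(String from, String to) {
        return Duration.between(
            LocalDateTime.parse(from, LOG_TS),
            LocalDateTime.parse(to, LOG_TS)).toMillis();
    }

    public static void main(String[] args) {
        String hookInvoked       = "2018-05-15 15:20:59,976"; // "Invoking shutdown hook..."
        String checkpointStarted = "2018-05-15 15:20:59,998"; // "Checkpoint started"
        String nodeStopped       = "2018-05-15 15:21:00,049"; // IgniteKernal "stopped OK"

        System.out.println("hook -> checkpoint start: "
            + millisBetween(hookInvoked, checkpointStarted) + " ms");
        System.out.println("checkpoint start -> stopped OK: "
            + millisBetween(checkpointStarted, nodeStopped) + " ms");
    }
}
```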
Hi Igniters, Ivan,
To my mind, this is not a bug. Ignite would be able to restore the memory state without waiting for the checkpoint to complete. Note that a checkpoint may be a very long-running operation.

Sincerely,
Dmitriy Pavlov

On Tue, 15 May 2018 at 15:35, Ivan Artukhov (JIRA) <[hidden email]> wrote:

> Ivan Artukhov created IGNITE-8497:
> -------------------------------------
>
>              Summary: Ignite stops the node in the middle of checkpointing
> upon receiving a SIGINT
>                  Key: IGNITE-8497
>                  URL: https://issues.apache.org/jira/browse/IGNITE-8497
It's a regular priority bug and should be fixed.
The issue doesn't cause any kind of data loss. It's harmless, but still undesirable: even if a checkpoint wasn't running, one will be triggered and then immediately interrupted by Ignition.stop(true). Such behavior increases the time of the following node startup.

I added "always" to the ticket summary to avoid misunderstanding.

Best Regards,
Ivan Rakov

On 15.05.2018 15:54, Dmitry Pavlov wrote:
> Hi Igniters, Ivan,
>
> To my mind it is not a bug. Ignite would be able to restore memory state
> without waiting checkpoint to be completed. Note checkpoint may be very
> long running operation.
>
> Sincerely,
> Dmitriy Pavlov
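The behavior described above (a checkpoint is triggered at stop and then immediately interrupted) can be pictured with a toy model in plain JDK threads. This is only a sketch under stated assumptions: the class, method, and "marker" names are hypothetical and it does not reflect Ignite internals. The `cancel` flag merely plays the role that the message attributes to Ignition.stop(true).

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicBoolean;

// Toy model (hypothetical names), not Ignite code: a "checkpoint" records a
// start marker, flushes pages, then records an end marker.
class CheckpointStopSketch {
    /** Returns true if the checkpoint triggered at stop reached its end marker. */
    static boolean stopNode(boolean cancel) {
        AtomicBoolean endMarkerWritten = new AtomicBoolean(false);
        CountDownLatch started = new CountDownLatch(1);

        Thread checkpointer = new Thread(() -> {
            started.countDown();              // start marker: checkpoint has begun
            try {
                Thread.sleep(200);            // stands in for flushing dirty pages
                endMarkerWritten.set(true);   // end marker: checkpoint is complete
            } catch (InterruptedException e) {
                // Interrupted mid-flush: the end marker is never written.
            }
        }, "db-checkpoint-thread");

        try {
            checkpointer.start();
            started.await();                  // the stop always triggers a checkpoint
            if (cancel)
                checkpointer.interrupt();     // the stop interrupts it right away
            checkpointer.join();              // wait for the thread to exit either way
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("stop was interrupted", e);
        }
        return endMarkerWritten.get();
    }

    public static void main(String[] args) {
        System.out.println("cancel=true  -> checkpoint finished: " + stopNode(true));
        System.out.println("cancel=false -> checkpoint finished: " + stopNode(false));
    }
}
```

In this model, a start marker without a matching end marker is what the next startup would have to notice and repair, which corresponds to the extra startup time mentioned in the message.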
Hi Ivan R.,
I understand now, thank you. Perhaps we could use the title "Ignite triggers checkpoint upon receiving a SIGINT even if it is not required".

On Tue, 15 May 2018 at 16:08, Ivan Rakov <[hidden email]> wrote:

> It's a regular priority bug and should be fixed.
> Issue doesn't cause any kind of data loss. It's harmless, but still
> undesirable: even if checkpoint wasn't running, it will be triggered and
> then immediately interrupted by Ignition.stop(true). Such behavior
> increases time of following node startup.
>
> I added "always" to the ticket summary to avoid misunderstanding.
>
> Best Regards,
> Ivan Rakov