Ignite 2.7 - New Issues on Node Termination

Gabriel Jimenez (BLOOMBERG/ 731 LEX)
Hello,

I have been working with Apache Ignite for the past couple of months, mainly 2.6, and recently upgraded our framework to 2.7. Since then I have run into some concerning issues on grid node termination, and I am wondering whether anyone else has had similar experiences or can point to available solutions.

With 2.6, our grid always terminated nodes successfully, printing

'Ignite ver. 2.6.0#(...) stopped OK'

However, since upgrading to 2.7.0 I have not had a single successful node termination while the grid is under load (namely during a rolling restart of the grid). At first I thought 2.7.0 might have improved error reporting, and that these errors had been present before but were not being caught. After looking into it further, it seems that each component is throwing its own variant of 'Ignite___Exception: Node is stopping'. My confidence dropped even further when I noticed one of the 'errors':

"
[ERROR] [Thread-27] IgniteKernal - Failed to stop component (ignoring): GridProcessorAdapter []
java.lang.UnsupportedOperationException: null
        at org.jsr166.ConcurrentLinkedHashMap.clear(ConcurrentLinkedHashMap.java:1551) ~[bdp-ignite-core-2.7.0.jar:2.7.0]
        at org.apache.ignite.internal.processors.job.GridJobProcessor.stop(GridJobProcessor.java:264) ~[bdp-ignite-core-2.7.0.jar:2.7.0]
        at org.apache.ignite.internal.IgniteKernal.stop0(IgniteKernal.java:2356) [bdp-ignite-core-2.7.0.jar:2.7.0]
        at org.apache.ignite.internal.IgniteKernal.stop(IgniteKernal.java:2228) [bdp-ignite-core-2.7.0.jar:2.7.0]
        at org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.stop0(IgnitionEx.java:2612) [bdp-ignite-core-2.7.0.jar:2.7.0]
        at org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.stop(IgnitionEx.java:2575) [bdp-ignite-core-2.7.0.jar:2.7.0]
        at org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance$6.run(IgnitionEx.java:2100) [bdp-ignite-core-2.7.0.jar:2.7.0]
"

The relevant lines of code are:

GridJobProcessor.stop(): https://github.com/apache/ignite/blob/2.7.0/modules/core/src/main/java/org/apache/ignite/internal/processors/job/GridJobProcessor.java#L264
ConcurrentLinkedHashMap.clear(): https://github.com/apache/ignite/blob/2.7.0/modules/core/src/main/java/org/jsr166/ConcurrentLinkedHashMap.java#L1550

It seems ConcurrentLinkedHashMap was recently changed to make clear() an unsupported operation, which produces what are arguably false errors on node termination. Has anyone else encountered issues on node termination in 2.7, or is the mistake on my end and I am missing something critical? Any insight on this matter would be greatly appreciated!
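
For readers following along, the failure mode described above can be reproduced in isolation. The following is only a minimal sketch, not Ignite's actual code: NoClearMap, Component, JobComponent and stopAll are hypothetical stand-ins for ConcurrentLinkedHashMap, the grid components, GridJobProcessor and the kernal's stop loop. It assumes, as the "(ignoring)" in the log line suggests, that the stop loop catches the exception, reports it, and carries on with the remaining components.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;

public class StopSketch {
    /** Hypothetical stand-in for a map whose clear() is an unsupported operation. */
    static class NoClearMap<K, V> extends ConcurrentHashMap<K, V> {
        @Override public void clear() {
            throw new UnsupportedOperationException();
        }
    }

    /** Hypothetical component contract: each component releases its state on stop. */
    interface Component {
        void stop();
    }

    /** Hypothetical component that calls clear() on such a map while stopping. */
    static class JobComponent implements Component {
        final NoClearMap<String, String> finishedJobs = new NoClearMap<>();

        @Override public void stop() {
            finishedJobs.clear(); // throws UnsupportedOperationException
        }
    }

    /** Stops every component, collecting failures instead of aborting the shutdown. */
    static List<String> stopAll(List<Component> components) {
        List<String> errors = new ArrayList<>();
        for (Component c : components) {
            try {
                c.stop();
            }
            catch (RuntimeException e) {
                // The failure is reported, but the remaining components still get stopped.
                errors.add("Failed to stop component (ignoring): " + e);
            }
        }
        return errors;
    }

    public static void main(String[] args) {
        List<Component> components = new ArrayList<>();
        components.add(new JobComponent());
        components.add(() -> { /* a component that stops cleanly */ });

        List<String> errors = stopAll(components);
        errors.forEach(System.out::println);
        System.out.println("Shutdown finished with " + errors.size() + " ignored error(s)");
    }
}
```

In this sketch the shutdown still completes, so the logged error reflects the unsupported clear() call rather than a component that was left running. Whether the same holds for the real node is something the linked sources would need to confirm.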

Sincerely,
Gabriel
Re: Ignite 2.7 - New Issues on Node Termination

Ilya Kasnacheev
Hello!

This is a known issue: https://issues.apache.org/jira/browse/IGNITE-10860

Regards,
--
Ilya Kasnacheev


On Sat, Feb 23, 2019 at 03:55, Gabriel Jimenez (BLOOMBERG/ 731 LEX) <[hidden email]> wrote:
