Apache Ignite Developers - Legacy Mail Archive

Significant CPU and memory consumption

Alexey Kuznetsov-2

Significant CPU and memory consumption

Steps to reproduce.

1. Start a node with a partitioned cache and load more than 1M indexed
entries into it.
In my case I used a data streamer.
Wait until the data is loaded.
2. Start one more node. It will FAIL (!!!!) to join the topology.
In VisualVM I see that the first node is consuming 25% of CPU, and on
the sampler page I see that it spends CPU in the following methods:
GridCacheMapEntry.deletedUnlocked()
GridCacheMapEntry.checkExpired()

Other scenario.

1. Start a couple of nodes without load.
2. Start a node with load. In this case all nodes join the topology.
3. After the load is finished, VisualVM shows CPU consumption in:
GridCacheMapEntry.deletedUnlocked()
GridCacheMapEntry.checkExpired()

The more entries are loaded into the cache (2M, 3M), the more CPU is
consumed.

--
Alexey Kuznetsov
GridGain Systems
www.gridgain.com

Alexey Goncharuk

Re: Significant CPU and memory consumption

We narrowed down the root cause of this issue: the discovery thread
iterates over the whole cache when collecting cache metrics. We fixed the
isEmpty method and this case looks OK now. Alexey will verify that the
rest of the tests are good. I think we should revoke the vote since this
is a critical issue.
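
For illustration only, here is a minimal sketch of why an emptiness check that walks every entry gets more expensive as the cache grows, and how a short-circuiting check avoids that. This is not the actual Ignite patch: the class IsEmptySketch, its Entry type, and the methods isEmptyFullScan and isEmptyShortCircuit are hypothetical names, and the per-entry liveness test only loosely stands in for checks such as deletedUnlocked() and checkExpired().

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

final class IsEmptySketch {
    /** Hypothetical stand-in for a cache entry; not Ignite's GridCacheMapEntry. */
    static final class Entry {
        final boolean deleted;
        final long expireTime; // 0 means "never expires" in this sketch

        Entry(boolean deleted, long expireTime) {
            this.deleted = deleted;
            this.expireTime = expireTime;
        }

        /** An entry counts as live if it is neither deleted nor expired. */
        boolean live(long now) {
            return !deleted && (expireTime == 0 || expireTime > now);
        }
    }

    static final Map<Integer, Entry> map = new ConcurrentHashMap<>();

    /** Number of per-entry checks performed; used only to compare the two variants. */
    static long visited;

    /** Replaces the map contents with n live entries. */
    static void fill(int n) {
        map.clear();
        for (int i = 0; i < n; i++)
            map.put(i, new Entry(false, 0));
    }

    /** Costly variant: checks every entry, so the work grows with the cache size. */
    static boolean isEmptyFullScan(long now) {
        int live = 0;
        for (Entry e : map.values()) {
            visited++;
            if (e.live(now))
                live++;
        }
        return live == 0;
    }

    /** Cheaper variant: stops at the first live entry it finds. */
    static boolean isEmptyShortCircuit(long now) {
        for (Entry e : map.values()) {
            visited++;
            if (e.live(now))
                return false;
        }
        return true;
    }

    public static void main(String[] args) {
        fill(1_000_000);
        long now = System.currentTimeMillis();

        visited = 0;
        isEmptyFullScan(now);
        System.out.println("full scan checks:     " + visited);

        visited = 0;
        isEmptyShortCircuit(now);
        System.out.println("short-circuit checks: " + visited);
    }
}
```

When such a check is called periodically from a metrics-collecting thread, the full-scan variant repeats work proportional to the number of entries on every call, which would be consistent with CPU usage growing as more entries are loaded.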

2015-05-06 21:50 GMT-07:00 Alexey Kuznetsov <[hidden email]>:

> Steps to reproduce.
>
> 1. Start a node with a partitioned cache and load more than 1M indexed
> entries into it.
> In my case I used a data streamer.
> Wait until the data is loaded.
> 2. Start one more node. It will FAIL (!!!!) to join the topology.
> In VisualVM I see that the first node is consuming 25% of CPU, and on
> the sampler page I see that it spends CPU in the following methods:
> GridCacheMapEntry.deletedUnlocked()
> GridCacheMapEntry.checkExpired()
>
> Other scenario.
>
> 1. Start a couple of nodes without load.
> 2. Start a node with load. In this case all nodes join the topology.
> 3. After the load is finished, VisualVM shows CPU consumption in:
> GridCacheMapEntry.deletedUnlocked()
> GridCacheMapEntry.checkExpired()
>
> The more entries are loaded into the cache (2M, 3M), the more CPU is
> consumed.
>
>
> --
> Alexey Kuznetsov
> GridGain Systems
> www.gridgain.com
>

dsetrakyan

Re: Significant CPU and memory consumption

On Thu, May 7, 2015 at 1:21 AM, Alexey Goncharuk <[hidden email]> wrote:

> We narrowed down the root cause of this issue: the discovery thread
> iterates over the whole cache when collecting cache metrics. We fixed the
> isEmpty method and this case looks OK now. Alexey will verify that the
> rest of the tests are good. I think we should revoke the vote since this
> is a critical issue.
>

I agree. I will cancel the vote. Let's fix this issue and resubmit the vote
when we are ready. Thanks to everyone for looking into this on such short
notice!


Konstantin Boudnik-2

Re: Significant CPU and memory consumption

In reply to this post by Alexey Goncharuk

Great and timely discovery! Let's re-vote once it's fixed.

Thanks!

On Wed, May 06, 2015 at 11:21PM, Alexey Goncharuk wrote:

> We narrowed down the root cause of this issue: the discovery thread
> iterates over the whole cache when collecting cache metrics. We fixed the
> isEmpty method and this case looks OK now. Alexey will verify that the
> rest of the tests are good. I think we should revoke the vote since this
> is a critical issue.