Significant CPU and memory consumption

Alexey Kuznetsov-2
Steps to reproduce.

1. Start a node with a partitioned cache and load more than 1M indexed
entries into the cache.
    In my case I used a data streamer.
    Wait until the data is loaded.
2. Start one more node. It will FAIL (!!!!) to join the topology.
    In VisualVM I see that the first node is consuming 25% of CPU, and on
the sampler page I see that the first node spends CPU time in the following
methods:
       GridCacheMapEntry.deletedUnlocked()
       GridCacheMapEntry.checkExpired()

Another scenario.

1. Start a couple of nodes without load.
2. Start a node with load. In this case all nodes join the topology.
3. After the load is finished, VisualVM shows CPU consumption in:
       GridCacheMapEntry.deletedUnlocked()
       GridCacheMapEntry.checkExpired()

The more entries are loaded into the cache (2M, 3M), the more CPU is
consumed.


--
Alexey Kuznetsov
GridGain Systems
www.gridgain.com
Re: Significant CPU and memory consumption

Alexey Goncharuk
We narrowed down the root cause of this issue: it is caused by the discovery
thread iterating over the whole cache when collecting cache metrics. We
fixed the isEmpty method, and this case looks OK now. Alexey will verify that
the rest of the tests pass. I think we should revoke the vote, since this is
a critical issue.
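
To illustrate why an isEmpty check can become expensive on a large cache, here is a minimal, self-contained sketch. It is not the Ignite code and not necessarily the fix that was applied: the `Entry` class, the `checks` counter, and both method names are hypothetical, and the per-entry `isLive` test only stands in for the kind of per-entry checks that showed up in the sampler. The sketch contrasts an emptiness check that visits every entry with one that stops at the first live entry.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class IsEmptySketch {
    /** Hypothetical stand-in for a cache entry; NOT Ignite's GridCacheMapEntry. */
    static final class Entry {
        final long expireTime; // 0 means "never expires"
        final boolean deleted;

        Entry(long expireTime, boolean deleted) {
            this.expireTime = expireTime;
            this.deleted = deleted;
        }

        /** An entry counts as live if it is neither deleted nor expired. */
        boolean isLive(long now) {
            return !deleted && (expireTime == 0 || expireTime > now);
        }
    }

    /** Counts per-entry checks so the two approaches can be compared. */
    static long checks;

    /** Visits every entry to count the live ones: cost grows with cache size. */
    static boolean isEmptyByFullScan(Map<Integer, Entry> cache, long now) {
        int live = 0;
        for (Entry e : cache.values()) {
            checks++;
            if (e.isLive(now))
                live++;
        }
        return live == 0;
    }

    /** Stops at the first live entry: one check when the first entry is live. */
    static boolean isEmptyShortCircuit(Map<Integer, Entry> cache, long now) {
        for (Entry e : cache.values()) {
            checks++;
            if (e.isLive(now))
                return false;
        }
        return true;
    }

    public static void main(String[] args) {
        Map<Integer, Entry> cache = new LinkedHashMap<>();
        for (int i = 0; i < 1_000_000; i++)
            cache.put(i, new Entry(0L, false));

        checks = 0;
        boolean emptyFull = isEmptyByFullScan(cache, 1L);
        long fullChecks = checks;

        checks = 0;
        boolean emptyShort = isEmptyShortCircuit(cache, 1L);
        long shortChecks = checks;

        System.out.println("full scan:     empty=" + emptyFull + ", checks=" + fullChecks);
        System.out.println("short circuit: empty=" + emptyShort + ", checks=" + shortChecks);
    }
}
```

If a check like the full scan runs on every metrics collection, its cost scales with the number of entries, which would match the observation that 2M or 3M entries consume more CPU than 1M.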

2015-05-06 21:50 GMT-07:00 Alexey Kuznetsov <[hidden email]>:

> Steps to reproduce.
>
> 1. Start a node with a partitioned cache and load more than 1M indexed
> entries into the cache.
>     In my case I used a data streamer.
>     Wait until the data is loaded.
> 2. Start one more node. It will FAIL (!!!!) to join the topology.
>     In VisualVM I see that the first node is consuming 25% of CPU, and on
> the sampler page I see that the first node spends CPU time in the following
> methods:
>        GridCacheMapEntry.deletedUnlocked()
>        GridCacheMapEntry.checkExpired()
>
> Another scenario.
>
> 1. Start a couple of nodes without load.
> 2. Start a node with load. In this case all nodes join the topology.
> 3. After the load is finished, VisualVM shows CPU consumption in:
>        GridCacheMapEntry.deletedUnlocked()
>        GridCacheMapEntry.checkExpired()
>
> The more entries are loaded into the cache (2M, 3M), the more CPU is
> consumed.
>
>
> --
> Alexey Kuznetsov
> GridGain Systems
> www.gridgain.com
>
Re: Significant CPU and memory consumption

dsetrakyan
On Thu, May 7, 2015 at 1:21 AM, Alexey Goncharuk <[hidden email]> wrote:

> We narrowed down the root cause of this issue: it is caused by the discovery
> thread iterating over the whole cache when collecting cache metrics. We
> fixed the isEmpty method, and this case looks OK now. Alexey will verify that
> the rest of the tests pass. I think we should revoke the vote, since this is
> a critical issue.
>

I agree. I will cancel the vote. Let's fix this issue and resubmit the vote
when we are ready. Thanks to everyone for looking into this on such short
notice!


Re: Significant CPU and memory consumption

Konstantin Boudnik-2
In reply to this post by Alexey Goncharuk
Great and timely discovery! Let's re-vote once it's fixed.

Thanks!

On Wed, May 06, 2015 at 11:21PM, Alexey Goncharuk wrote:

> We narrowed down the root cause of this issue: it is caused by the discovery
> thread iterating over the whole cache when collecting cache metrics. We
> fixed the isEmpty method, and this case looks OK now. Alexey will verify that
> the rest of the tests pass. I think we should revoke the vote, since this is
> a critical issue.