[IGNITE-5717] improvements of MemoryPolicy default size

Re: [IGNITE-5717] improvements of MemoryPolicy default size

dmagda
Vladimir, Dmitriy P.,

Please see inline

> On Aug 2, 2017, at 7:20 AM, Vladimir Ozerov <[hidden email]> wrote:
>
> Denis,
>
> The reason is that a product should not hang the user's computer. How else
> could this be explained? I am a developer. I start Ignite with 1 node, 2
> nodes, X nodes, and observe how they join the topology. I add one key, 10
> keys, 1M keys. Then I introduce a bug in an example, accidentally load 100M
> keys, and have to restart the computer. The correct behavior is to have a
> small "maxMemory" by default to avoid that. The user should get an exception
> instead of a hang. E.g. Java's "-Xmx" is typically 25% of RAM, a more
> adequate value compared to Ignite's.
>

Right, the developer was educated about the Java heap parameters and limited the overall space, preferring an OOM to a laptop suspension. Who knows how he arrived at the conclusion that 25% of RAM should be used. That might have come from deep knowledge of the JVM, or he may have faced several hangs while testing the application.

Anyway, the JVM creators didn’t decide to predefine the Java heap as a static value to avoid situations like the one above. Neither should we, as a platform. Let's educate people about Ignite's memory behavior, like Sun did for the Java heap, and not try to compensate for the lack of knowledge with a static default memory size.


> It doesn't matter whether you use persistence or not. The persistent case
> just makes this flaw more obvious: you have virtually unlimited disk, and yet
> you end up with swapping and a hang when using Ignite with the default
> configuration. As already explained, the problem is not about allocating
> "maxMemory" right away, but about the value of "maxMemory": it is too big.
>

How do you know what the default should be, then? Why 1 GB? For instance, if I have only 1 GB of free memory left and try to start 2 server nodes and an application, I will face the laptop suspension again.


Denis

> "We had this behavior before" is never an argument. Previous offheap
> implementation had a lot of flaws, so let's just forget about it.
>
> On Wed, Aug 2, 2017 at 5:08 PM, Denis Magda <[hidden email]> wrote:
>
>> Sergey,
>>
>> That’s expected: as this discussion revealed, allocation works differently
>> depending on whether persistence is used or not:
>>
>> 1) In-memory mode (persistence is disabled) - the space is allocated
>> incrementally until the max threshold is reached. Good!
>>
>> 2) Persistence mode - the whole space (limited by the max threshold) is
>> allocated right away. It’s not surprising that your laptop starts choking.
>>
>> So, in my previous response I tried to explain that I can’t find any
>> reason why we should adjust 1). Are there any reasons other than massive
>> preloading?
>>
>> As for 2), it was a big surprise to discover this after the 2.1 release.
>> We definitely have to fix this somehow.
>>
>> —
>> Denis
>>
>>> On Aug 2, 2017, at 6:59 AM, Sergey Chugunov <[hidden email]> wrote:
>>>
>>> Denis,
>>>
>>> Just a simple example from our own codebase: I tried to execute
>>> PersistentStoreExample with default settings and two server nodes, and the
>>> client node froze during the initial load of data into the grid, although
>>> with one server node the example finishes pretty quickly.
>>>
>>> My laptop isn't the weakest one and has 16 GB of memory, but it cannot
>>> cope with this.
>>>
>>>
>>> On Wed, Aug 2, 2017 at 4:58 PM, Denis Magda <[hidden email]> wrote:
>>>
>>>>> As for allocating 80% of available RAM - I was against this even for
>>>>> in-memory mode and still think that this is the wrong default. Looking at
>>>>> free RAM is even worse because it gives you undefined behavior.
>>>>
>>>> Guys, I cannot understand how the high-level behavior of this dynamic
>>>> memory allocation (with persistence DISABLED) differs from the legacy
>>>> off-heap memory we had in 1.x. Both off-heap memories allocate space on
>>>> demand; the current one just does this more aggressively, requesting big
>>>> chunks.
>>>>
>>>> Next, the legacy one was unlimited by default, and the user could start as
>>>> many nodes as he wanted on a laptop and preload as much data as he needed.
>>>> Sure, he could bring down the laptop if too many entries were injected into
>>>> the local cluster. But that is a matter of excessive preloading and is not
>>>> caused by the ability of the legacy off-heap memory to grow infinitely. The
>>>> same preloading would cause a hang if the Java heap memory mode were used.
>>>>
>>>> The upshot is that massive preloading of data on a local laptop should not
>>>> be fixed by repealing dynamic memory allocation. Is there any other reason
>>>> why we have to use static memory allocation when persistence is disabled?
>>>> I think the case with persistence should be reviewed separately.
>>>>
>>>> —
>>>> Denis
>>>>
>>>>> On Aug 2, 2017, at 12:45 AM, Alexey Goncharuk <[hidden email]> wrote:
>>>>>
>>>>> Dmitriy,
>>>>>
>>>>> The reason behind this is the need to be able to evict pages to disk and
>>>>> load them back, so we need to preserve a PageId->Pointer mapping in
>>>>> memory. To do this in the most efficient way, we need to know in advance
>>>>> all the address ranges we work with. We can add dynamic memory extension
>>>>> for the persistence-enabled config, but this will add yet another step of
>>>>> indirection when resolving every page address, which adds a noticeable
>>>>> performance penalty.
>>>>>
>>>>>
>>>>>
>>>>> 2017-08-02 10:37 GMT+03:00 Dmitriy Setrakyan <[hidden email]>:
>>>>>
>>>>>> On Wed, Aug 2, 2017 at 9:33 AM, Vladimir Ozerov <[hidden email]> wrote:
>>>>>>
>>>>>>> Dima,
>>>>>>>
>>>>>>> Probably folks who worked closely with storage know why.
>>>>>>>
>>>>>>
>>>>>> Without knowing why, how can we make a decision?
>>>>>>
>>>>>> Alexey Goncharuk, was it you who made the decision about not using
>>>>>> increments? Do you remember what the reason was?
>>>>>>
>>>>>>
>>>>>>>
>>>>>>> The real problem is that before being started once in a production
>>>>>>> environment, Ignite will typically be started a hundred times in a
>>>>>>> developer's environment. I think the default should be ~10% of total RAM.
>>>>>>>
>>>>>>
>>>>>> Why not 80% of *free* RAM?
>>>>>>
>>>>>>
>>>>>>>
>>>>>>> On Wed, Aug 2, 2017 at 10:21 AM, Dmitriy Setrakyan <[hidden email]> wrote:
>>>>>>>
>>>>>>>> On Wed, Aug 2, 2017 at 7:27 AM, Vladimir Ozerov <[hidden email]> wrote:
>>>>>>>>
>>>>>>>>> Please see Sergey's original message - when persistence is enabled,
>>>>>>>>> memory is not allocated incrementally; maxSize is used.
>>>>>>>>>
>>>>>>>>
>>>>>>>> Why?
>>>>>>>>
>>>>>>>>
>>>>>>>>> Default settings must allow for normal work in a developer's
>>>>>>>>> environment.
>>>>>>>>>
>>>>>>>>
>>>>>>>> Agree, but why not in increments?
>>>>>>>>
>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Wed, Aug 2, 2017 at 1:10, Denis Magda <[hidden email]> wrote:
>>>>>>>>>
>>>>>>>>>>> Why not allocate in increments automatically?
>>>>>>>>>>
>>>>>>>>>> This is exactly how the allocation works right now. The memory will grow
>>>>>>>>>> incrementally until the max size is reached (80% of RAM by default).
>>>>>>>>>>
>>>>>>>>>> —
>>>>>>>>>> Denis
>>>>>>>>>>
>>>>>>>>>>> On Aug 1, 2017, at 3:03 PM, [hidden email] wrote:
>>>>>>>>>>>
>>>>>>>>>>> Vova, 1GB seems a bit too small to me, and frankly I do not want to
>>>>>>>>>>> guess. Why not allocate in increments automatically?
>>>>>>>>>>>
>>>>>>>>>>> D.
>>>>>>>>>>>
>>>>>>>>>>> On Aug 1, 2017, at 11:03 PM, Vladimir Ozerov <[hidden email]> wrote:
>>>>>>>>>>>> Denis,
>>>>>>>>>>>> No doubt you haven't heard about it - AI 2.1 with persistence, where
>>>>>>>>>>>> 80% of RAM is allocated right away, was released several days ago. How
>>>>>>>>>>>> many users do you think have tried it already?
>>>>>>>>>>>>
>>>>>>>>>>>> Guys,
>>>>>>>>>>>> Do you really think allocating 80% of available RAM is a normal thing?
>>>>>>>>>>>> Take your laptop and check how much available RAM you have right now.
>>>>>>>>>>>> Do you fit into the remaining 20%? If not, then running AI with
>>>>>>>>>>>> persistence with all defaults will bring your machine down. This is
>>>>>>>>>>>> insane. We should allocate no more than 1 GB, so that the user can play
>>>>>>>>>>>> with it without any problems.
>>>>>>>>>>>>
>>>>>>>>>>>> On Tue, Aug 1, 2017 at 10:26 PM, Denis Magda <[hidden email]> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> My vote goes for option #1 too. I don’t think 80% is so aggressive
>>>>>>>>>>>>> that it needs to be brought down.
>>>>>>>>>>>>>
>>>>>>>>>>>>> IGNITE-5717 was created to fix the issue of the 80% RAM allocation on
>>>>>>>>>>>>> 64-bit systems when Ignite runs on top of a 32-bit JVM. I’ve not heard
>>>>>>>>>>>>> of any other complaints regarding the default allocation size.
>>>>>>>>>>>>>
>>>>>>>>>>>>> —
>>>>>>>>>>>>> Denis
>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Aug 1, 2017, at 10:58 AM, [hidden email] wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I prefer option #1.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> D.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Aug 1, 2017, at 11:20 AM, Sergey Chugunov <[hidden email]> wrote:
>>>>>>>>>>>>>>> Folks,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I would like to get back to the question about MemoryPolicy maxMemory
>>>>>>>>>>>>>>> defaults.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Although MemoryPolicy may be configured with initial and maxMemory
>>>>>>>>>>>>>>> settings, when persistence is used MemoryPolicy always allocates the
>>>>>>>>>>>>>>> full maxMemory size for performance reasons.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> As the default size of maxMemory is 80% of physical memory, it causes
>>>>>>>>>>>>>>> OOMEs on 32-bit platforms (either at the OS or JVM level) and hurts
>>>>>>>>>>>>>>> performance in setups where multiple Ignite nodes are started on the
>>>>>>>>>>>>>>> same physical server.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I suggest rethinking these defaults and switching to one of the
>>>>>>>>>>>>>>> following options:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> - Check whether the platform is 32-bit or 64-bit and adapt the defaults.
>>>>>>>>>>>>>>> In this case we still need to address the issue with multiple nodes on
>>>>>>>>>>>>>>> one machine, even on 64-bit systems.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> - Lower the defaults for maxMemory and allocate, for instance,
>>>>>>>>>>>>>>> max(0.3 * availableMemory, 1Gb).
>>>>>>>>>>>>>>> This option allows us to solve all issues with starting on 32-bit
>>>>>>>>>>>>>>> platforms and reduces instability with multiple nodes on the same
>>>>>>>>>>>>>>> machine.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thoughts and/or other options?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>> Sergey.
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>
>>>>
>>
>>
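
Alexey's point in the quoted thread, that growing the region dynamically adds a step of indirection to every page address lookup, can be illustrated with a rough sketch. This is not Ignite's page memory code: the class, the method names, the 4 KB page size and the use of plain numbers in place of off-heap pointers are all assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.List;

// Illustrative only: addresses are plain numbers, not real off-heap pointers,
// and none of these names come from the Ignite codebase.
class PageAddressSketch {
    static final long PAGE_SIZE = 4096; // assumed page size

    // Whole region reserved up front: one known base, pure arithmetic.
    static long resolveFixed(long regionBase, long pageIdx) {
        return regionBase + pageIdx * PAGE_SIZE;
    }

    // Region grown chunk by chunk: find the chunk first (the extra
    // indirection), then do the same arithmetic inside that chunk.
    static long resolveChunked(List<Long> chunkBases, long pagesPerChunk, long pageIdx) {
        long chunkBase = chunkBases.get((int) (pageIdx / pagesPerChunk));
        return chunkBase + (pageIdx % pagesPerChunk) * PAGE_SIZE;
    }

    public static void main(String[] args) {
        System.out.println(resolveFixed(1_000L, 3));

        // Two chunks that are not adjacent in the address space.
        List<Long> chunks = Arrays.asList(1_000L, 900_000L);
        System.out.println(resolveChunked(chunks, 2, 3));
    }
}
```

The second method does one more memory read per lookup than the first; on a hot path that resolves every page access, that is the kind of cost Alexey describes as noticeable.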

Re: [IGNITE-5717] improvements of MemoryPolicy default size

Vladimir Ozerov
My proposal is 10% instead of 80%.
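
To put the candidate defaults from this thread into numbers, here is a small arithmetic sketch for the 16 GB laptop with two server nodes mentioned earlier. It uses no Ignite classes; the helper names are made up, and "availableMemory" in the max(0.3 * availableMemory, 1Gb) option is read as total physical RAM, which is an assumption.

```java
// Plain arithmetic; no Ignite classes are involved and all names are hypothetical.
class DefaultSizeSketch {
    static final long GB = 1024L * 1024 * 1024;

    // A default expressed as a percentage of total RAM (80 today, 10 proposed).
    static long percentOfRam(long totalRam, int percent) {
        return totalRam * percent / 100;
    }

    // The max(0.3 * availableMemory, 1Gb) option, reading "availableMemory"
    // as total RAM (assumption).
    static long thirtyPercentOrOneGb(long totalRam) {
        return Math.max(percentOfRam(totalRam, 30), GB);
    }

    // Upper bound on what several nodes on one machine may claim together.
    static long worstCase(long perNodeMax, int nodes) {
        return perNodeMax * nodes;
    }

    public static void main(String[] args) {
        long ram = 16 * GB; // laptop size from the thread
        int nodes = 2;      // two server nodes on the same machine
        long[] candidates = {
            percentOfRam(ram, 80), thirtyPercentOrOneGb(ram), percentOfRam(ram, 10)
        };
        for (long perNode : candidates) {
            long total = worstCase(perNode, nodes);
            System.out.printf("per node %.1f GB, %d nodes %.1f GB, fits in RAM: %b%n",
                (double) perNode / GB, nodes, (double) total / GB, total <= ram);
        }
    }
}
```

With persistence enabled the worst case is reached at startup, because each node allocates its full maximum right away; without persistence it is only reached if enough data is loaded.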

Re: [IGNITE-5717] improvements of MemoryPolicy default size

dsetrakyan
Here is what we should do:

   1. Pick an acceptable number. It does not matter if it is 10% or 50%.
   2. Print the allocated memory in *BOLD* letters in the log.
   3. Make sure that an Ignite server never hangs due to low memory.
   We should sense it and kick the node out automatically, again with a *BOLD*
   message in the log.

 Is this possible?

D.
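
The three steps above could look roughly like the sketch below. Everything in it is hypothetical: the class and method names, the banner text and the 5% free-memory threshold are placeholders, and it uses fixed numbers where a real node would read total and free physical memory from the JVM's operating-system MXBean. Actually stopping the node is left out.

```java
// Hypothetical sketch of the three steps; none of these names exist in Ignite.
class LowMemoryGuardSketch {
    static final long MB = 1024L * 1024;

    // Step 1: pick a default as a percentage of total RAM.
    static long pickMaxSize(long totalRam, int percent) {
        return totalRam * percent / 100;
    }

    // Step 2: a startup log line that is hard to miss.
    static String banner(long maxSize) {
        return ">>> OFF-HEAP MEMORY LIMIT FOR THIS NODE: " + (maxSize / MB) + " MB <<<";
    }

    // Step 3: true when free memory has dropped below the given share of
    // total RAM, i.e. the node should be stopped before the machine hangs.
    static boolean shouldStopNode(long freeRam, long totalRam, int minFreePercent) {
        return freeRam * 100 < totalRam * minFreePercent;
    }

    public static void main(String[] args) {
        long totalRam = 16_384 * MB; // fixed value instead of a real OS query
        long maxSize = pickMaxSize(totalRam, 10);
        System.out.println(banner(maxSize));

        long freeRam = 512 * MB; // pretend the machine is nearly exhausted
        if (shouldStopNode(freeRam, totalRam, 5)) {
            System.out.println(">>> LOW MEMORY: THIS NODE WOULD BE STOPPED <<<");
        }
    }
}
```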

On Wed, Aug 2, 2017 at 6:09 PM, Vladimir Ozerov <[hidden email]> wrote:

> My proposal is 10% instead of 80%.
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>> IGNITE-5717 was created to fix the issue of the 80% RAM
> > >>>>>>> allocation
> > >>>>>>>> on
> > >>>>>>>>>>>> 64
> > >>>>>>>>>>>>> bit systems when Ignite works on top of 32 bit JVM. I’ve
> not
> > >>>>>>> heard
> > >>>>>>>> of
> > >>>>>>>>>>>> any
> > >>>>>>>>>>>>> other complaints in regards the default allocation size.
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>> —
> > >>>>>>>>>>>>> Denis
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>>> On Aug 1, 2017, at 10:58 AM, [hidden email] wrote:
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> I prefer option #1.
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> ⁣D.​
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> On Aug 1, 2017, 11:20 AM, at 11:20 AM, Sergey Chugunov <
> > >>>>>>>>>>>>> [hidden email]> wrote:
> > >>>>>>>>>>>>>>> Folks,
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> I would like to get back to the question about
> MemoryPolicy
> > >>>>>>>>>>>> maxMemory
> > >>>>>>>>>>>>>>> defaults.
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> Although MemoryPolicy may be configured with initial and
> > >>>>>>>> maxMemory
> > >>>>>>>>>>>>>>> settings, when persistence is used MemoryPolicy always
> > >>>>>>> allocates
> > >>>>>>>>>>>>>>> maxMemory
> > >>>>>>>>>>>>>>> size for performance reasons.
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> As default size of maxMemory is 80% of physical memory it
> > >>>>>>> causes
> > >>>>>>>>>>>> OOME
> > >>>>>>>>>>>>>>> exceptions of 32 bit platforms (either on OS or JVM
> level)
> > >>>>>> and
> > >>>>>>>>>>>> hurts
> > >>>>>>>>>>>>>>> performance in setups when multiple Ignite nodes are
> > started
> > >>>>>> on
> > >>>>>>>>>>>> the
> > >>>>>>>>>>>>>>> same
> > >>>>>>>>>>>>>>> physical server.
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> I suggest to rethink these defaults and switch to other
> > >>>>>>> options:
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> - Check whether platform is 32 or 64 bits and adapt
> > defaults.
> > >>>>>>> In
> > >>>>>>>>>>>> this
> > >>>>>>>>>>>>>>> case we still need to address the issue with multiple
> nodes
> > >>>>>> on
> > >>>>>>>> one
> > >>>>>>>>>>>>>>> machine
> > >>>>>>>>>>>>>>> even on 64 bit systems.
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> - Lower defaults for maxMemory and allocate, for
> instance,
> > >>>>>>>>>>>> max(0.3 *
> > >>>>>>>>>>>>>>> availableMemory, 1Gb).
> > >>>>>>>>>>>>>>> This option allows us to solve all issues with starting
> on
> > 32
> > >>>>>>> bit
> > >>>>>>>>>>>>>>> platforms and reduce instability with multiple nodes on
> the
> > >>>>>>> same
> > >>>>>>>>>>>>>>> machine.
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> Thoughts and/or other options?
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> Thanks,
> > >>>>>>>>>>>>>>> Sergey.
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>>
> > >>>>>>>>>>
> > >>>>>>>>>>
> > >>>>>>>>>
> > >>>>>>>>
> > >>>>>>>
> > >>>>>>
> > >>>>
> > >>>>
> > >>
> > >>
> >
> >
>

Re: [IGNITE-5717] improvements of MemoryPolicy default size

Sergey Chugunov
Dmitriy,

The last item makes perfect sense to me; one may think of it as an analogue
of Java's OutOfMemoryError.
However, it looks like such a feature requires considerable effort to design
and implement properly, so I propose creating a separate ticket and agreeing
on a target version for it.

Items #1 and #2 will be implemented under IGNITE-5717. Makes sense?

Thanks,
Sergey.
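
Items #1 and #2 from Dmitriy's list (quoted below) could look roughly like the following. This is only a minimal sketch under stated assumptions, not the actual IGNITE-5717 change: the class `DefaultMemoryPolicySize`, its methods, and the 1 GB cap for 32-bit JVMs are hypothetical names and values chosen for illustration. Reading total RAM through `com.sun.management.OperatingSystemMXBean` and detecting a 32-bit JVM through the `sun.arch.data.model` property are both assumed to be available, which holds on HotSpot-based JVMs but is not guaranteed elsewhere.

```java
import java.lang.management.ManagementFactory;
import java.util.logging.Logger;

/** Hypothetical helper for illustration only; not part of the Ignite API. */
final class DefaultMemoryPolicySize {
    private static final Logger LOG =
        Logger.getLogger(DefaultMemoryPolicySize.class.getName());

    /** Assumed cap for 32-bit JVMs (1 GB); the value is illustrative. */
    static final long MAX_32_BIT = 1L << 30;

    /** Item #1: a fixed fraction of total physical RAM, capped on 32-bit JVMs. */
    static long defaultMaxSize(long totalRamBytes, double fraction, boolean is32BitJvm) {
        if (totalRamBytes <= 0 || fraction <= 0.0 || fraction > 1.0)
            throw new IllegalArgumentException("Invalid RAM size or fraction");

        long size = (long) (totalRamBytes * fraction);

        return is32BitJvm ? Math.min(size, MAX_32_BIT) : size;
    }

    /** Item #2: a message that is hard to miss in the log. */
    static String banner(long maxSizeBytes) {
        return ">>> MEMORY POLICY MAX SIZE: " + (maxSizeBytes / (1024 * 1024)) + " MB <<<";
    }

    public static void main(String[] args) {
        // Assumption: the platform bean is the com.sun.management variant (HotSpot).
        com.sun.management.OperatingSystemMXBean os =
            (com.sun.management.OperatingSystemMXBean) ManagementFactory.getOperatingSystemMXBean();

        long totalRam = os.getTotalPhysicalMemorySize();

        // Assumption: this HotSpot-specific property reports "32" or "64".
        boolean is32Bit = "32".equals(System.getProperty("sun.arch.data.model"));

        // 10% is the fraction proposed in the thread; any agreed value fits here.
        long maxSize = defaultMaxSize(totalRam, 0.10, is32Bit);

        LOG.warning(banner(maxSize));
    }
}
```

Keeping the size calculation separate from the platform probing makes the chosen fraction easy to change and to test once the thread agrees on a number.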

On Thu, Aug 3, 2017 at 4:34 AM, Dmitriy Setrakyan <[hidden email]>
wrote:

> Here is what we should do:
>
>    1. Pick an acceptable number. Does not matter if it is 10% or 50%.
>    2. Print the allocated memory in *BOLD* letters into the log.
>    3. Make sure that Ignite server never hangs due to the low memory issue.
>    We should sense it and kick the node out automatically, again with a
>    *BOLD* message in the log.
>
>  Is this possible?
>
> D.

Re: [IGNITE-5717] improvements of MemoryPolicy default size

dsetrakyan
Without #3, items #1 and #2 make little sense.

Why is #3 so difficult?

D.

On Aug 4, 2017, at 10:46 AM, Sergey Chugunov <[hidden email]> wrote:

>Dmitriy,
>
>The last item makes perfect sense to me; one may think of it as an analogue
>of Java's OutOfMemoryError.
>However, it looks like such a feature requires considerable effort to design
>and implement properly, so I propose creating a separate ticket and agreeing
>on a target version for it.
>
>Items #1 and #2 will be implemented under IGNITE-5717. Makes sense?
>
>Thanks,
>Sergey.
>
>On Thu, Aug 3, 2017 at 4:34 AM, Dmitriy Setrakyan
><[hidden email]>
>wrote:
>
>> Here is what we should do:
>>
>>    1. Pick an acceptable number. Does not matter if it is 10% or 50%.
>>    2. Print the allocated memory in *BOLD* letters into the log.
>>    3. Make sure that Ignite server never hangs due to the low memory
>issue.
>>    We should sense it and kick the node out automatically, again with
>a
>> *BOLD*
>>    message in the log.
>>
>>  Is this possible?
>>
>> D.
>>
>> On Wed, Aug 2, 2017 at 6:09 PM, Vladimir Ozerov
><[hidden email]>
>> wrote:
>>
>> > My proposal is 10% instead of 80%.
>> >
>> > ср, 2 авг. 2017 г. в 18:54, Denis Magda <[hidden email]>:
>> >
>> > > Vladimir, Dmitriy P.,
>> > >
>> > > Please see inline
>> > >
>> > > > On Aug 2, 2017, at 7:20 AM, Vladimir Ozerov
><[hidden email]>
>> > > wrote:
>> > > >
>> > > > Denis,
>> > > >
>> > > > The reason is that product should not hang user's computer. How
>else
>> > this
>> > > > could be explained? I am developer. I start Ignite, 1 node, 2
>nodes,
>> X
>> > > > nodes, observe how they join topology. Add one key, 10 keys, 1M
>keys.
>> > > Then
>> > > > I do a bug in example and load 100M keys accidentally - restart
>the
>> > > > computer. Correct behavior is to have small "maxMemory" by
>default to
>> > > avoid
>> > > > that. User should get exception instead of hang. E.g. Java's
>"-Xmx"
>> is
>> > > > typically 25% of RAM - more adequate value, comparing to
>Ignite.
>> > > >
>> > >
>> > > Right, the developer was educated about the Java heap parameters
>and
>> > > limited the overall space preferring OOM to the laptop
>suspension. Who
>> > > knows how he got to the point that 25% RAM should be used. That
>might
>> > have
>> > > been deep knowledge about JVM or he faced several hangs while
>testing
>> the
>> > > application.
>> > >
>> > > Anyway, JVM creators didn’t decide to predefine the Java heap to
>a
>> static
>> > > value to avoid the situations like above. So should not we as a
>> platform.
>> > > Educate people about the Ignite memory behavior like Sun did for
>the
>> Java
>> > > heap but do not try to solve the lack of knowledge with the
>default
>> > static
>> > > memory size.
>> > >
>> > >
>> > > > It doesn't matter whether you use persistence or not.
>Persistent case
>> > > just
>> > > > makes this flaw more obvious - you have virtually unlimited
>disk, and
>> > yet
>> > > > you end up with swapping and hang when using Ignite with
>default
>> > > > configuration. As already explained, the problem is not about
>> > allocating
>> > > > "maxMemory" right away, but about the value of "maxMemory" - it
>is
>> too
>> > > big.
>> > > >
>> > >
>> > > How do you know what should be the default then? Why 1 GB? For
>> instance,
>> > > if I end up having only 1 GB of free memory left and try to start
>2
>> > server
>> > > nodes and an application I will face the laptop suspension again.
>> > >
>> > > —
>> > > Denis
>> > >
>> > > > "We had this behavior before" is never an argument. Previous
>offheap
>> > > > implementation had a lot of flaws, so let's just forget about
>it.
>> > > >
>> > > > On Wed, Aug 2, 2017 at 5:08 PM, Denis Magda <[hidden email]>
>> wrote:
>> > > >
>> > > >> Sergey,
>> > > >>
>> > > >> That’s expectable because as we revealed from this discussion
>the
>> > > >> allocation works different depending on whether the
>persistence is
>> > used
>> > > or
>> > > >> not:
>> > > >>
>> > > >> 1) In-memory mode (the persistence is disabled) - the space
>will be
>> > > >> allocated incrementally until the max threshold is reached.
>Good!
>> > > >>
>> > > >> 2) The persistence mode - the whole space (limited by the max
>> > threshold)
>> > > >> is allocated right away. It’s not surprising that your laptop
>starts
>> > > >> choking.
>> > > >>
>> > > >> So, in my previous response I tried to explain that I can’t
>find any
>> > > >> reason why we should adjust 1). Any reasons except for the
>massive
>> > > >> preloading?
>> > > >>
>> > > >> As for 2), that was a big surprise to reveal this after 2.1
>release.
>> > > >> Definitely we have to fix this somehow.
>> > > >>
>> > > >> —
>> > > >> Denis
>> > > >>
>> > > >>> On Aug 2, 2017, at 6:59 AM, Sergey Chugunov <
>> > [hidden email]
>> > > >
>> > > >> wrote:
>> > > >>>
>> > > >>> Denis,
>> > > >>>
>> > > >>> Just a simple example from our own codebase: I tried to
>execute
>> > > >>> PersistentStoreExample with default settings and two server
>nodes
>> and
>> > > >>> client node got frozen even on initial load of data into the
>grid.
>> > > >>> Although with one server node the example finishes pretty
>quickly.
>> > > >>>
>> > > >>> And my laptop isn't the weakest one and has 16 gigs of
>memory, but
>> it
>> > > >>> cannot deal with it.
>> > > >>>
>> > > >>>
>> > > >>> On Wed, Aug 2, 2017 at 4:58 PM, Denis Magda
><[hidden email]>
>> > wrote:
>> > > >>>
>> > > >>>>> As far as allocating 80% of available RAM - I was against
>this
>> even
>> > > for
>> > > >>>>> In-memory mode and still think that this is a wrong
>default.
>> > Looking
>> > > at
>> > > >>>>> free RAM is even worse because it gives you undefined
>behavior.
>> > > >>>>
>> > > >>>> Guys, I can not understand how this dynamic memory
>allocation's
>> > > >> high-level
>> > > >>>> behavior (with the persistence DISABLED) is different from
>the
>> > legacy
>> > > >>>> off-heap memory we had in 1.x. Both off-heap memories
>allocate the
>> > > >> space on
>> > > >>>> demand, the current just does this more aggressively
>requesting
>> big
>> > > >> chunks.
>> > > >>>>
>> > > >>>> Next, the legacy one was unlimited by default and the user
>can
>> start
>> > > as
>> > > >>>> many nodes as he wanted on a laptop and preload as much data
>as he
>> > > >> needed.
>> > > >>>> Sure he could bring down the laptop if too many entries were
>> > injected
>> > > >> into
>> > > >>>> the local cluster. But that’s about too massive preloading
>and not
>> > > >> caused
>> > > >>>> by the ability of the legacy off-heap memory to grow
>infinitely.
>> The
>> > > >> same
>> > > >>>> preloading would cause a hang if the Java heap memory mode
>is
>> used.
>> > > >>>>
>> > > >>>> The upshot is that the massive preloading of data on the
>local
>> > laptop
>> > > >>>> should not fixed with repealing of the dynamic memory
>allocation.
>> > > >>>> Is there any other reason why we have to use the static
>memory
>> > > >> allocation
>> > > >>>> for the case when the persistence is disabled? I think the
>case
>> with
>> > > the
>> > > >>>> persistence should be reviewed separately.
>> > > >>>>
>> > > >>>> —
>> > > >>>> Denis
>> > > >>>>
>> > > >>>>> On Aug 2, 2017, at 12:45 AM, Alexey Goncharuk <
>> > > >>>> [hidden email]> wrote:
>> > > >>>>>
>> > > >>>>> Dmitriy,
>> > > >>>>>
>> > > >>>>> The reason behind this is the need to to be able to evict
>and
>> load
>> > > >> pages
>> > > >>>> to
>> > > >>>>> disk, thus we need to preserve a PageId->Pointer mapping in
>> memory.
>> > > In
>> > > >>>>> order to do this in the most efficient way, we need to know
>in
>> > > advance
>> > > >>>> all
>> > > >>>>> the address ranges we work with. We can add dynamic memory
>> > extension
>> > > >> for
>> > > >>>>> persistence-enabled config, but this will add yet another
>step of
>> > > >>>>> indirection when resolving every page address, which adds a
>> > > noticeable
>> > > >>>>> performance penalty.
>> > > >>>>>
>> > > >>>>>
>> > > >>>>>
>> > > >>>>> 2017-08-02 10:37 GMT+03:00 Dmitriy Setrakyan <
>> > [hidden email]
>> > > >:
>> > > >>>>>
>> > > >>>>>> On Wed, Aug 2, 2017 at 9:33 AM, Vladimir Ozerov <
>> > > [hidden email]
>> > > >>>
>> > > >>>>>> wrote:
>> > > >>>>>>
>> > > >>>>>>> Dima,
>> > > >>>>>>>
>> > > >>>>>>> Probably folks who worked closely with storage know why.
>> > > >>>>>>>
>> > > >>>>>>
>> > > >>>>>> Without knowing why, how can we make a decision?
>> > > >>>>>>
>> > > >>>>>> Alexey Goncharuk, was it you who made the decision about
>not
>> using
>> > > >>>>>> increments? Do know remember what was the reason?
>> > > >>>>>>
>> > > >>>>>>
>> > > >>>>>>>
>> > > >>>>>>> The very problem is that before being started once on
>> production
>> > > >>>>>>> environment, Ignite will typically be started hundred
>times on
>> > > >>>>>> developer's
>> > > >>>>>>> environment. I think that default should be ~10% of total
>RAM.
>> > > >>>>>>>
>> > > >>>>>>
>> > > >>>>>> Why not 80% of *free *RAM?
>> > > >>>>>>
>> > > >>>>>>
>> > > >>>>>>>
>> > > >>>>>>> On Wed, Aug 2, 2017 at 10:21 AM, Dmitriy Setrakyan <
>> > > >>>>>> [hidden email]>
>> > > >>>>>>> wrote:
>> > > >>>>>>>
>> > > >>>>>>>> On Wed, Aug 2, 2017 at 7:27 AM, Vladimir Ozerov <
>> > > >> [hidden email]
>> > > >>>>>
>> > > >>>>>>>> wrote:
>> > > >>>>>>>>
>> > > >>>>>>>>> Please see original Sergey's message - when persistence
>is
>> > > enabled,
>> > > >>>>>>>> memory
>> > > >>>>>>>>> is not allocated incrementally, maxSize is used.
>> > > >>>>>>>>>
>> > > >>>>>>>>
>> > > >>>>>>>> Why?
>> > > >>>>>>>>
>> > > >>>>>>>>
>> > > >>>>>>>>> Default settings must allow for normal work on
>developer's
>> > > >>>>>> environment.
>> > > >>>>>>>>>
>> > > >>>>>>>>
>> > > >>>>>>>> Agree, but why not in increments?
>> > > >>>>>>>>
>> > > >>>>>>>>
>> > > >>>>>>>>>
>> > > >>>>>>>>> ср, 2 авг. 2017 г. в 1:10, Denis Magda
><[hidden email]>:
>> > > >>>>>>>>>
>> > > >>>>>>>>>>> Why not allocate in increments automatically?
>> > > >>>>>>>>>>
>> > > >>>>>>>>>> This is exactly how the allocation works right now.
>The
>> memory
>> > > >> will
>> > > >>>>>>>> grow
>> > > >>>>>>>>>> incrementally until the max size is reached (80% of
>RAM by
>> > > >>>>>> default).
>> > > >>>>>>>>>>
>> > > >>>>>>>>>> —
>> > > >>>>>>>>>> Denis
>> > > >>>>>>>>>>
>> > > >>>>>>>>>>> On Aug 1, 2017, at 3:03 PM, [hidden email]
>wrote:
>> > > >>>>>>>>>>>
>> > > >>>>>>>>>>> Vova, 1GB seems a bit too small for me, and frankly i
>do
>> not
>> > > want
>> > > >>>>>>> t o
>> > > >>>>>>>>>> guess. Why not allocate in increments automatically?
>> > > >>>>>>>>>>>
>> > > >>>>>>>>>>> ⁣D.​
>> > > >>>>>>>>>>>

Re: [IGNITE-5717] improvements of MemoryPolicy default size

Sergey Chugunov
Do you see an obvious way of implementing it?

In Java there is a heap with a GC working on it, so it is possible, for
instance, to decide to throw an OOM based on some GC metrics.

I may be wrong, but I don't see a mechanism in Ignite that could be used right
away for this purpose.
Implementing something without thorough planning brings a huge risk of
false positives, with nodes stopping when they don't have to.

That's why I think it must be implemented and intensively tested as part of
a separate ticket.

Thanks,
Sergey.
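
For reference, items #1 and #2 of the list under discussion (pick a default fraction of RAM, and print the resulting size to the log) could look roughly like the sketch below. This is only an illustration under stated assumptions: `DefaultRegionSize`, `defaultMaxBytes` and `describe` are hypothetical names, not Ignite API, the fraction is left as a parameter because the thread has not settled on a value, and the total RAM is passed in rather than read from the platform.

```java
import java.util.Locale;

/** Hypothetical helper for illustration only; not part of the Ignite API. */
final class DefaultRegionSize {
    private static final long MIB = 1024L * 1024L;

    /** Returns the default max size in bytes as a fraction of total physical RAM. */
    static long defaultMaxBytes(long totalRamBytes, double fraction) {
        if (totalRamBytes <= 0 || fraction <= 0.0 || fraction > 1.0)
            throw new IllegalArgumentException("Invalid RAM size or fraction");

        return (long) (totalRamBytes * fraction);
    }

    /** Builds the line a node could print at startup so the chosen size is visible. */
    static String describe(long totalRamBytes, double fraction) {
        long maxBytes = defaultMaxBytes(totalRamBytes, fraction);

        return String.format(Locale.ROOT,
            "Off-heap max size: %d MiB (%.0f%% of %d MiB RAM)",
            maxBytes / MIB, fraction * 100.0, totalRamBytes / MIB);
    }

    public static void main(String[] args) {
        long sixteenGiB = 16L * 1024L * MIB;

        // The 0.5 fraction is an arbitrary example value, not a proposal.
        System.out.println(describe(sixteenGiB, 0.5));
    }
}
```

Keeping the computation a pure function of the RAM size makes it easy to test the default on machines of different sizes without starting a node.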

On Fri, Aug 4, 2017 at 12:18 PM, <[hidden email]> wrote:

> Without #3, the #1 and #2 make little sense.
>
> Why is #3 so difficult?
>
> D.
>
> On Aug 4, 2017, 10:46 AM, Sergey Chugunov <[hidden email]> wrote:
> >Dmitriy,
> >
> >Last item makes perfect sense to me, one may think of it as an
> >"OutOfMemoryException" in java.
> >However, it looks like such feature requires considerable efforts to
> >properly design and implement it, so I would propose to create a separate
> >ticket and agree upon target version for it.
> >
> >Items #1 and #2 will be implemented under IGNITE-5717. Makes sense?
> >
> >Thanks,
> >Sergey.
> >
> >On Thu, Aug 3, 2017 at 4:34 AM, Dmitriy Setrakyan <[hidden email]> wrote:
> >
> >> Here is what we should do:
> >>
> >>    1. Pick an acceptable number. Does not matter if it is 10% or 50%.
> >>    2. Print the allocated memory in *BOLD* letters into the log.
> >>    3. Make sure that Ignite server never hangs due to the low memory issue.
> >>    We should sense it and kick the node out automatically, again with a *BOLD*
> >>    message in the log.
> >>
> >>  Is this possible?
> >>
> >> D.

Re: [IGNITE-5717] improvements of MemoryPolicy default size

dsetrakyan
Hang on. I thought we were talking about off-heap size, so GC should not be relevant. Am I wrong?

D.
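
To make the off-heap point concrete: a limit on off-heap memory can, in principle, be enforced by counting reserved bytes, with no GC metrics involved. The sketch below is only an illustration of that idea. `OffHeapBudget` and its methods are hypothetical names, not Ignite classes, and the sketch says nothing about how Ignite actually tracks its memory or how a node would be stopped.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical byte budget for an off-heap region; not an Ignite class. */
final class OffHeapBudget {
    private final long maxBytes;
    private final AtomicLong used = new AtomicLong();

    OffHeapBudget(long maxBytes) {
        if (maxBytes <= 0)
            throw new IllegalArgumentException("maxBytes must be positive");

        this.maxBytes = maxBytes;
    }

    /** Reserves bytes or fails fast; the counter is left unchanged on failure. */
    void reserve(long bytes) {
        if (bytes < 0)
            throw new IllegalArgumentException("bytes must be non-negative");

        while (true) {
            long cur = used.get();

            // Compare against the remaining space to avoid overflow in cur + bytes.
            if (bytes > maxBytes - cur)
                throw new IllegalStateException("Off-heap limit reached: used=" + cur
                    + ", requested=" + bytes + ", max=" + maxBytes);

            if (used.compareAndSet(cur, cur + bytes))
                return;
        }
    }

    /** Returns previously reserved bytes to the budget. */
    void release(long bytes) {
        used.updateAndGet(cur -> {
            if (bytes < 0 || bytes > cur)
                throw new IllegalArgumentException("Cannot release " + bytes + " of " + cur);

            return cur - bytes;
        });
    }

    long used() {
        return used.get();
    }
}
```

With a budget like this the caller gets an exception at the moment the limit would be exceeded, which is the "exception instead of hang" behavior asked for earlier in the thread; whether that is enough to avoid false positives in a real node is exactly the open design question.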

On Aug 4, 2017, 11:38 AM, Sergey Chugunov <[hidden email]> wrote:

>Do you see an obvious way of implementing it?
>
>In java there is a heap and GC working on it. And for instance, it is
>possible to make a decision to throw an OOM based on some gc metrics.
>
>I may be wrong but I don't see a mechanism in Ignite to use it right away
>for such purposes.
>And implementing something without thorough planning brings huge risk of
>false positives with nodes stopping when they don't have to.
>
>That's why I think it must be implemented and intensively tested as part of
>a separate ticket.
>
>Thanks,
>Sergey.
>
>On Fri, Aug 4, 2017 at 12:18 PM, <[hidden email]> wrote:
>
>> Without #3, the #1 and #2 make little sense.
>>
>> Why is #3 so difficult?
>>
>> D.
>> >> > > >>>>>>>>>>>> On Tue, Aug 1, 2017 at 10:26 PM, Denis Magda <
>> >> > > [hidden email]
>> >> > > >>>>>>>
>> >> > > >>>>>>>>> wrote:
>> >> > > >>>>>>>>>>>>
>> >> > > >>>>>>>>>>>>> My vote goes for option #1 too. I don’t think
>that
>> >80% is
>> >> > too
>> >> > > >>>>>>>>>>>> aggressive
>> >> > > >>>>>>>>>>>>> to bring it down.
>> >> > > >>>>>>>>>>>>>
>> >> > > >>>>>>>>>>>>> IGNITE-5717 was created to fix the issue of the
>80%
>> >RAM
>> >> > > >>>>>>> allocation
>> >> > > >>>>>>>> on
>> >> > > >>>>>>>>>>>> 64
>> >> > > >>>>>>>>>>>>> bit systems when Ignite works on top of 32 bit
>JVM.
>> >I’ve
>> >> > not
>> >> > > >>>>>>> heard
>> >> > > >>>>>>>> of
>> >> > > >>>>>>>>>>>> any
>> >> > > >>>>>>>>>>>>> other complaints in regards the default
>allocation
>> >size.
>> >> > > >>>>>>>>>>>>>
>> >> > > >>>>>>>>>>>>> —
>> >> > > >>>>>>>>>>>>> Denis
>> >> > > >>>>>>>>>>>>>
>> >> > > >>>>>>>>>>>>>> On Aug 1, 2017, at 10:58 AM,
>[hidden email]
>> >> wrote:
>> >> > > >>>>>>>>>>>>>>
>> >> > > >>>>>>>>>>>>>> I prefer option #1.
>> >> > > >>>>>>>>>>>>>>
>> >> > > >>>>>>>>>>>>>> ⁣D.​
>> >> > > >>>>>>>>>>>>>>
>> >> > > >>>>>>>>>>>>>> On Aug 1, 2017, 11:20 AM, at 11:20 AM, Sergey
>> >Chugunov <
>> >> > > >>>>>>>>>>>>> [hidden email]> wrote:
>> >> > > >>>>>>>>>>>>>>> Folks,
>> >> > > >>>>>>>>>>>>>>>
>> >> > > >>>>>>>>>>>>>>> I would like to get back to the question about
>> >> > MemoryPolicy
>> >> > > >>>>>>>>>>>> maxMemory
>> >> > > >>>>>>>>>>>>>>> defaults.
>> >> > > >>>>>>>>>>>>>>>
>> >> > > >>>>>>>>>>>>>>> Although MemoryPolicy may be configured with
>> >initial
>> >> and
>> >> > > >>>>>>>> maxMemory
>> >> > > >>>>>>>>>>>>>>> settings, when persistence is used
>MemoryPolicy
>> >always
>> >> > > >>>>>>> allocates
>> >> > > >>>>>>>>>>>>>>> maxMemory
>> >> > > >>>>>>>>>>>>>>> size for performance reasons.
>> >> > > >>>>>>>>>>>>>>>
>> >> > > >>>>>>>>>>>>>>> As default size of maxMemory is 80% of
>physical
>> >memory
>> >> it
>> >> > > >>>>>>> causes
>> >> > > >>>>>>>>>>>> OOME
>> >> > > >>>>>>>>>>>>>>> exceptions of 32 bit platforms (either on OS
>or
>> >JVM
>> >> > level)
>> >> > > >>>>>> and
>> >> > > >>>>>>>>>>>> hurts
>> >> > > >>>>>>>>>>>>>>> performance in setups when multiple Ignite
>nodes
>> >are
>> >> > > started
>> >> > > >>>>>> on
>> >> > > >>>>>>>>>>>> the
>> >> > > >>>>>>>>>>>>>>> same
>> >> > > >>>>>>>>>>>>>>> physical server.
>> >> > > >>>>>>>>>>>>>>>
>> >> > > >>>>>>>>>>>>>>> I suggest to rethink these defaults and switch
>to
>> >other
>> >> > > >>>>>>> options:
>> >> > > >>>>>>>>>>>>>>>
>> >> > > >>>>>>>>>>>>>>> - Check whether platform is 32 or 64 bits and
>> >adapt
>> >> > > defaults.
>> >> > > >>>>>>> In
>> >> > > >>>>>>>>>>>> this
>> >> > > >>>>>>>>>>>>>>> case we still need to address the issue with
>> >multiple
>> >> > nodes
>> >> > > >>>>>> on
>> >> > > >>>>>>>> one
>> >> > > >>>>>>>>>>>>>>> machine
>> >> > > >>>>>>>>>>>>>>> even on 64 bit systems.
>> >> > > >>>>>>>>>>>>>>>
>> >> > > >>>>>>>>>>>>>>> - Lower defaults for maxMemory and allocate,
>for
>> >> > instance,
>> >> > > >>>>>>>>>>>> max(0.3 *
>> >> > > >>>>>>>>>>>>>>> availableMemory, 1Gb).
>> >> > > >>>>>>>>>>>>>>> This option allows us to solve all issues with
>> >starting
>> >> > on
>> >> > > 32
>> >> > > >>>>>>> bit
>> >> > > >>>>>>>>>>>>>>> platforms and reduce instability with multiple
>> >nodes on
>> >> > the
>> >> > > >>>>>>> same
>> >> > > >>>>>>>>>>>>>>> machine.
>> >> > > >>>>>>>>>>>>>>>
>> >> > > >>>>>>>>>>>>>>>
>> >> > > >>>>>>>>>>>>>>> Thoughts and/or other options?
>> >> > > >>>>>>>>>>>>>>>
>> >> > > >>>>>>>>>>>>>>> Thanks,
>> >> > > >>>>>>>>>>>>>>> Sergey.
>> >> > > >>>>>>>>>>>>>
>> >> > > >>>>>>>>>>>>>
>> >> > > >>>>>>>>>>
>> >> > > >>>>>>>>>>
>> >> > > >>>>>>>>>
>> >> > > >>>>>>>>
>> >> > > >>>>>>>
>> >> > > >>>>>>
>> >> > > >>>>
>> >> > > >>>>
>> >> > > >>
>> >> > > >>
>> >> > >
>> >> > >
>> >> >
>> >>
>>

Re: [IGNITE-5717] improvements of MemoryPolicy default size

Sergey Chugunov
I used GC and Java only as an example; they are not applicable to the Ignite
case, where we manage off-heap memory.

My point is that there is no easy way to implement this feature in Ignite,
and more time is needed to properly design it and account for all risks.

Thanks,
Sergey.

On Fri, Aug 4, 2017 at 12:44 PM, <[hidden email]> wrote:

> Hang on. I thought we were talking about off-heap size; GC should not be
> relevant. Am I wrong?
>
> D.
>
> On Aug 4, 2017, at 11:38 AM, Sergey Chugunov <[hidden email]> wrote:
> >Do you see an obvious way of implementing it?
> >
> >In Java there is a heap and a GC working on it. For instance, it is
> >possible to decide to throw an OOM based on some GC metrics.
> >
> >I may be wrong, but I don't see a mechanism in Ignite that could be used
> >right away for such purposes.
> >And implementing something without thorough planning brings a huge risk
> >of false positives, with nodes stopping when they don't have to.
> >
> >That's why I think it must be implemented and intensively tested as
> >part of a separate ticket.
> >
> >Thanks,
> >Sergey.
> >
> >On Fri, Aug 4, 2017 at 12:18 PM, <[hidden email]> wrote:
> >
> >> Without #3, #1 and #2 make little sense.
> >>
> >> Why is #3 so difficult?
> >>
> >> D.
> >>
> >> On Aug 4, 2017, at 10:46 AM, Sergey Chugunov <[hidden email]> wrote:
> >> >Dmitriy,
> >> >
> >> >The last item makes perfect sense to me; one may think of it as an
> >> >"OutOfMemoryError" in Java.
> >> >However, it looks like such a feature requires considerable effort to
> >> >design and implement properly, so I would propose creating a separate
> >> >ticket and agreeing upon a target version for it.
> >> >
> >> >Items #1 and #2 will be implemented under IGNITE-5717. Makes sense?
> >> >
> >> >Thanks,
> >> >Sergey.
> >> >
> >> >On Thu, Aug 3, 2017 at 4:34 AM, Dmitriy Setrakyan
> >> ><[hidden email]>
> >> >wrote:
> >> >
> >> >> Here is what we should do:
> >> >>
> >> >>    1. Pick an acceptable number. Does not matter if it is 10% or 50%.
> >> >>    2. Print the allocated memory in *BOLD* letters into the log.
> >> >>    3. Make sure that Ignite server never hangs due to the low memory
> >> >>    issue. We should sense it and kick the node out automatically,
> >> >>    again with a *BOLD* message in the log.
> >> >>
> >> >>  Is this possible?
> >> >>
> >> >> D.
> >> >>
> >> >> On Wed, Aug 2, 2017 at 6:09 PM, Vladimir Ozerov
> >> ><[hidden email]>
> >> >> wrote:
> >> >>
> >> >> > My proposal is 10% instead of 80%.
> >> >> >
>
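
For reference, the default sizes debated in this thread (80% of RAM, 10% of RAM, and max(0.3 * availableMemory, 1Gb)) can be compared with a little arithmetic. The sketch below is only an illustration, not Ignite code: the function names are hypothetical, "availableMemory" is assumed to mean total physical RAM, and every node is assumed to claim its full maxMemory at startup, which is the behavior described earlier in the thread for persistence. JVM heap and other processes are ignored.

```python
GIB = 1024 ** 3  # bytes in one GiB


def default_max_size(total_ram: int, policy: str) -> int:
    """Return a per-node maxMemory in bytes for one of the proposed defaults."""
    if policy == "80%":
        return total_ram * 8 // 10
    if policy == "10%":
        return total_ram // 10
    if policy == "max(30%, 1GiB)":
        return max(total_ram * 3 // 10, GIB)
    raise ValueError(f"unknown policy: {policy}")


def committed(total_ram: int, policy: str, nodes: int) -> int:
    """Memory claimed up front when every node allocates maxMemory at start."""
    return nodes * default_max_size(total_ram, policy)


if __name__ == "__main__":
    ram = 16 * GIB  # machine size chosen for illustration
    for policy in ("80%", "10%", "max(30%, 1GiB)"):
        total = committed(ram, policy, nodes=2)
        verdict = "exceeds RAM" if total > ram else "fits in RAM"
        print(f"{policy:>15}: {total / GIB:5.1f} GiB for 2 nodes, {verdict}")
```

Under these assumptions, two nodes at the 80% default ask for more memory than the machine has, while the two lower defaults leave headroom. Whether that headroom is enough in practice depends on what else is running.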

Re: [IGNITE-5717] improvements of MemoryPolicy default size

dsetrakyan
But why? We allocate the memory, so we should know when it runs out. What am I missing?

D.
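
One way to read the point above is as plain accounting: a region that grows in increments up to a configured cap can refuse a request that would cross the cap, instead of letting the machine hang. The sketch below shows only that accounting side. The class and exception names are hypothetical and are not Ignite APIs, and it does not detect low memory at the OS level (for example swapping), which a counter inside the process cannot see.

```python
class RegionOutOfMemory(Exception):
    """Raised when a request would push the region past its configured cap."""


class MemoryRegion:
    """Tracks a region that grows in fixed increments up to max_size bytes."""

    def __init__(self, max_size: int, increment: int) -> None:
        if max_size <= 0 or increment <= 0:
            raise ValueError("max_size and increment must be positive")
        self.max_size = max_size
        self.increment = increment
        self.reserved = 0  # bytes notionally obtained from the OS so far
        self.used = 0      # bytes handed out to callers

    def allocate(self, nbytes: int) -> None:
        if nbytes < 0:
            raise ValueError("nbytes must be non-negative")
        needed = self.used + nbytes
        if needed > self.max_size:
            # Fail fast with an explicit error rather than growing further.
            raise RegionOutOfMemory(
                f"requested {nbytes} bytes, {self.max_size - self.used} left"
            )
        # Grow one increment at a time, never past the cap.
        while self.reserved < needed:
            self.reserved = min(self.reserved + self.increment, self.max_size)
        self.used = needed
```

A caller would catch RegionOutOfMemory and log it prominently, which matches the "exception instead of hang" behavior asked for earlier in the thread.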

On Aug 4, 2017, at 11:55 AM, Sergey Chugunov <[hidden email]> wrote:

>I used GC and Java only as an example; they are not applicable to the
>Ignite case, where we manage off-heap memory.
>
>My point is that there is no easy way to implement this feature in
>Ignite, and more time is needed to properly design it and account for
>all risks.
>
>Thanks,
>Sergey.
>
>> >from
>> >> >the
>> >> >> > legacy
>> >> >> > > >>>> off-heap memory we had in 1.x. Both off-heap memories
>> >> >allocate the
>> >> >> > > >> space on
>> >> >> > > >>>> demand, the current just does this more aggressively
>> >> >requesting
>> >> >> big
>> >> >> > > >> chunks.
>> >> >> > > >>>>
>> >> >> > > >>>> Next, the legacy one was unlimited by default and the
>> >user
>> >> >can
>> >> >> start
>> >> >> > > as
>> >> >> > > >>>> many nodes as he wanted on a laptop and preload as
>much
>> >data
>> >> >as he
>> >> >> > > >> needed.
>> >> >> > > >>>> Sure he could bring down the laptop if too many
>entries
>> >were
>> >> >> > injected
>> >> >> > > >> into
>> >> >> > > >>>> the local cluster. But that’s about too massive
>> >preloading
>> >> >and not
>> >> >> > > >> caused
>> >> >> > > >>>> by the ability of the legacy off-heap memory to grow
>> >> >infinitely.
>> >> >> The
>> >> >> > > >> same
>> >> >> > > >>>> preloading would cause a hang if the Java heap memory
>> >mode
>> >> >is
>> >> >> used.
>> >> >> > > >>>>
>> >> >> > > >>>> The upshot is that the massive preloading of data on
>the
>> >> >local
>> >> >> > laptop
>> >> >> > > >>>> should not fixed with repealing of the dynamic memory
>> >> >allocation.
>> >> >> > > >>>> Is there any other reason why we have to use the
>static
>> >> >memory
>> >> >> > > >> allocation
>> >> >> > > >>>> for the case when the persistence is disabled? I think
>> >the
>> >> >case
>> >> >> with
>> >> >> > > the
>> >> >> > > >>>> persistence should be reviewed separately.
>> >> >> > > >>>>
>> >> >> > > >>>> —
>> >> >> > > >>>> Denis
>> >> >> > > >>>>
>> >> >> > > >>>>> On Aug 2, 2017, at 12:45 AM, Alexey Goncharuk <
>> >> >> > > >>>> [hidden email]> wrote:
>> >> >> > > >>>>>
>> >> >> > > >>>>> Dmitriy,
>> >> >> > > >>>>>
>> >> >> > > >>>>> The reason behind this is the need to to be able to
>> >evict
>> >> >and
>> >> >> load
>> >> >> > > >> pages
>> >> >> > > >>>> to
>> >> >> > > >>>>> disk, thus we need to preserve a PageId->Pointer
>mapping
>> >in
>> >> >> memory.
>> >> >> > > In
>> >> >> > > >>>>> order to do this in the most efficient way, we need
>to
>> >know
>> >> >in
>> >> >> > > advance
>> >> >> > > >>>> all
>> >> >> > > >>>>> the address ranges we work with. We can add dynamic
>> >memory
>> >> >> > extension
>> >> >> > > >> for
>> >> >> > > >>>>> persistence-enabled config, but this will add yet
>> >another
>> >> >step of
>> >> >> > > >>>>> indirection when resolving every page address, which
>> >adds a
>> >> >> > > noticeable
>> >> >> > > >>>>> performance penalty.
>> >> >> > > >>>>>
>> >> >> > > >>>>>
>> >> >> > > >>>>>
>> >> >> > > >>>>> 2017-08-02 10:37 GMT+03:00 Dmitriy Setrakyan <
>> >> >> > [hidden email]
>> >> >> > > >:
>> >> >> > > >>>>>
>> >> >> > > >>>>>> On Wed, Aug 2, 2017 at 9:33 AM, Vladimir Ozerov <
>> >> >> > > [hidden email]
>> >> >> > > >>>
>> >> >> > > >>>>>> wrote:
>> >> >> > > >>>>>>
>> >> >> > > >>>>>>> Dima,
>> >> >> > > >>>>>>>
>> >> >> > > >>>>>>> Probably folks who worked closely with storage know
>> >why.
>> >> >> > > >>>>>>>
>> >> >> > > >>>>>>
>> >> >> > > >>>>>> Without knowing why, how can we make a decision?
>> >> >> > > >>>>>>
>> >> >> > > >>>>>> Alexey Goncharuk, was it you who made the decision
>> >about
>> >> >not
>> >> >> using
>> >> >> > > >>>>>> increments? Do know remember what was the reason?
>> >> >> > > >>>>>>
>> >> >> > > >>>>>>
>> >> >> > > >>>>>>>
>> >> >> > > >>>>>>> The very problem is that before being started once
>on
>> >> >> production
>> >> >> > > >>>>>>> environment, Ignite will typically be started
>hundred
>> >> >times on
>> >> >> > > >>>>>> developer's
>> >> >> > > >>>>>>> environment. I think that default should be ~10% of
>> >total
>> >> >RAM.
>> >> >> > > >>>>>>>
>> >> >> > > >>>>>>
>> >> >> > > >>>>>> Why not 80% of *free *RAM?
>> >> >> > > >>>>>>
>> >> >> > > >>>>>>
>> >> >> > > >>>>>>>
>> >> >> > > >>>>>>> On Wed, Aug 2, 2017 at 10:21 AM, Dmitriy Setrakyan
><
>> >> >> > > >>>>>> [hidden email]>
>> >> >> > > >>>>>>> wrote:
>> >> >> > > >>>>>>>
>> >> >> > > >>>>>>>> On Wed, Aug 2, 2017 at 7:27 AM, Vladimir Ozerov <
>> >> >> > > >> [hidden email]
>> >> >> > > >>>>>
>> >> >> > > >>>>>>>> wrote:
>> >> >> > > >>>>>>>>
>> >> >> > > >>>>>>>>> Please see original Sergey's message - when
>> >persistence
>> >> >is
>> >> >> > > enabled,
>> >> >> > > >>>>>>>> memory
>> >> >> > > >>>>>>>>> is not allocated incrementally, maxSize is used.
>> >> >> > > >>>>>>>>>
>> >> >> > > >>>>>>>>
>> >> >> > > >>>>>>>> Why?
>> >> >> > > >>>>>>>>
>> >> >> > > >>>>>>>>
>> >> >> > > >>>>>>>>> Default settings must allow for normal work on
>> >> >developer's
>> >> >> > > >>>>>> environment.
>> >> >> > > >>>>>>>>>
>> >> >> > > >>>>>>>>
>> >> >> > > >>>>>>>> Agree, but why not in increments?
>> >> >> > > >>>>>>>>
>> >> >> > > >>>>>>>>
>> >> >> > > >>>>>>>>>
>> >> >> > > >>>>>>>>> On Wed, Aug 2, 2017 at 1:10 AM, Denis Magda
>> >> ><[hidden email]> wrote:
>> >> >> > > >>>>>>>>>
>> >> >> > > >>>>>>>>>>> Why not allocate in increments automatically?
>> >> >> > > >>>>>>>>>>
>> >> >> > > >>>>>>>>>> This is exactly how the allocation works right
>now.
>> >> >The
>> >> >> memory
>> >> >> > > >> will
>> >> >> > > >>>>>>>> grow
>> >> >> > > >>>>>>>>>> incrementally until the max size is reached (80%
>of
>> >> >RAM by
>> >> >> > > >>>>>> default).
>> >> >> > > >>>>>>>>>>
>> >> >> > > >>>>>>>>>> —
>> >> >> > > >>>>>>>>>> Denis
>> >> >> > > >>>>>>>>>>
>> >> >> > > >>>>>>>>>>> On Aug 1, 2017, at 3:03 PM,
>[hidden email]
>> >> >wrote:
>> >> >> > > >>>>>>>>>>>
>> >> >> > > >>>>>>>>>>> Vova, 1GB seems a bit too small for me, and
>> >frankly i
>> >> >do
>> >> >> not
>> >> >> > > want
>> >> >> > > >>>>>>> t o
>> >> >> > > >>>>>>>>>> guess. Why not allocate in increments
>> >automatically?
>> >> >> > > >>>>>>>>>>>
>> >> >> > > >>>>>>>>>>> D.
>> >> >> > > >>>>>>>>>>>
>> >> >> > > >>>>>>>>>>> On Aug 1, 2017, 11:03 PM, at 11:03 PM, Vladimir
>> >> >Ozerov <
>> >> >> > > >>>>>>>>>> [hidden email]> wrote:
>> >> >> > > >>>>>>>>>>>> Denis,
>> >> >> > > >>>>>>>>>>>> No doubts you haven't heard about it - AI 2.1
>> >with
>> >> >> > > persistence,
>> >> >> > > >>>>>>> when
>> >> >> > > >>>>>>>>>>>> 80% of
>> >> >> > > >>>>>>>>>>>> RAM is allocated right away, was released
>several
>> >> >days
>> >> >> ago.
>> >> >> > > How
>> >> >> > > >>>>>> do
>> >> >> > > >>>>>>>> you
>> >> >> > > >>>>>>>>>>>> think, how many users tried it already?
>> >> >> > > >>>>>>>>>>>>
>> >> >> > > >>>>>>>>>>>> Guys,
>> >> >> > > >>>>>>>>>>>> Do you really think allocating 80% of
>available
>> >RAM
>> >> >is a
>> >> >> > > normal
>> >> >> > > >>>>>>>> thing?
>> >> >> > > >>>>>>>>>>>> Take
>> >> >> > > >>>>>>>>>>>> your laptop and check how many available RAM
>you
>> >> >have
>> >> >> right
>> >> >> > > now.
>> >> >> > > >>>>>>> Do
>> >> >> > > >>>>>>>>> you
>> >> >> > > >>>>>>>>>>>> fit
>> >> >> > > >>>>>>>>>>>> to remaining 20%? If not, then running AI with
>> >> >persistence
>> >> >> > > with
>> >> >> > > >>>>>>> all
>> >> >> > > >>>>>>>>>>>> defaults will bring your machine down. This is
>> >> >insane. We
>> >> >> > > shold
>> >> >> > > >>>>>>>>>>>> allocate no
>> >> >> > > >>>>>>>>>>>> more than 1Gb, so that user can play with it
>> >without
>> >> >any
>> >> >> > > >>>>>> problems.
>> >> >> > > >>>>>>>>>>>>
>> >> >> > > >>>>>>>>>>>> On Tue, Aug 1, 2017 at 10:26 PM, Denis Magda <
>> >> >> > > [hidden email]
>> >> >> > > >>>>>>>
>> >> >> > > >>>>>>>>> wrote:
>> >> >> > > >>>>>>>>>>>>
>> >> >> > > >>>>>>>>>>>>> My vote goes for option #1 too. I don’t think
>> >that
>> >> >80% is
>> >> >> > too
>> >> >> > > >>>>>>>>>>>> aggressive
>> >> >> > > >>>>>>>>>>>>> to bring it down.
>> >> >> > > >>>>>>>>>>>>>
>> >> >> > > >>>>>>>>>>>>> IGNITE-5717 was created to fix the issue of
>the
>> >80%
>> >> >RAM
>> >> >> > > >>>>>>> allocation
>> >> >> > > >>>>>>>> on
>> >> >> > > >>>>>>>>>>>> 64
>> >> >> > > >>>>>>>>>>>>> bit systems when Ignite works on top of 32
>bit
>> >JVM.
>> >> >I’ve
>> >> >> > not
>> >> >> > > >>>>>>> heard
>> >> >> > > >>>>>>>> of
>> >> >> > > >>>>>>>>>>>> any
>> >> >> > > >>>>>>>>>>>>> other complaints in regards the default
>> >allocation
>> >> >size.
>> >> >> > > >>>>>>>>>>>>>
>> >> >> > > >>>>>>>>>>>>> —
>> >> >> > > >>>>>>>>>>>>> Denis
>> >> >> > > >>>>>>>>>>>>>
>> >> >> > > >>>>>>>>>>>>>> On Aug 1, 2017, at 10:58 AM,
>> >[hidden email]
>> >> >> wrote:
>> >> >> > > >>>>>>>>>>>>>>
>> >> >> > > >>>>>>>>>>>>>> I prefer option #1.
>> >> >> > > >>>>>>>>>>>>>>
>> >> >> > > >>>>>>>>>>>>>> D.
>> >> >> > > >>>>>>>>>>>>>>
>> >> >> > > >>>>>>>>>>>>>> On Aug 1, 2017, 11:20 AM, at 11:20 AM,
>Sergey
>> >> >Chugunov <
>> >> >> > > >>>>>>>>>>>>> [hidden email]> wrote:
>> >> >> > > >>>>>>>>>>>>>>> Folks,
>> >> >> > > >>>>>>>>>>>>>>>
>> >> >> > > >>>>>>>>>>>>>>> I would like to get back to the question
>about
>> >> >> > MemoryPolicy
>> >> >> > > >>>>>>>>>>>> maxMemory
>> >> >> > > >>>>>>>>>>>>>>> defaults.
>> >> >> > > >>>>>>>>>>>>>>>
>> >> >> > > >>>>>>>>>>>>>>> Although MemoryPolicy may be configured
>with
>> >> >initial
>> >> >> and
>> >> >> > > >>>>>>>> maxMemory
>> >> >> > > >>>>>>>>>>>>>>> settings, when persistence is used
>> >MemoryPolicy
>> >> >always
>> >> >> > > >>>>>>> allocates
>> >> >> > > >>>>>>>>>>>>>>> maxMemory
>> >> >> > > >>>>>>>>>>>>>>> size for performance reasons.
>> >> >> > > >>>>>>>>>>>>>>>
>> >> >> > > >>>>>>>>>>>>>>> As default size of maxMemory is 80% of
>> >physical
>> >> >memory
>> >> >> it
>> >> >> > > >>>>>>> causes
>> >> >> > > >>>>>>>>>>>> OOME
>> >> >> > > >>>>>>>>>>>>>>> exceptions of 32 bit platforms (either on
>OS
>> >or
>> >> >JVM
>> >> >> > level)
>> >> >> > > >>>>>> and
>> >> >> > > >>>>>>>>>>>> hurts
>> >> >> > > >>>>>>>>>>>>>>> performance in setups when multiple Ignite
>> >nodes
>> >> >are
>> >> >> > > started
>> >> >> > > >>>>>> on
>> >> >> > > >>>>>>>>>>>> the
>> >> >> > > >>>>>>>>>>>>>>> same
>> >> >> > > >>>>>>>>>>>>>>> physical server.
>> >> >> > > >>>>>>>>>>>>>>>
>> >> >> > > >>>>>>>>>>>>>>> I suggest to rethink these defaults and
>switch
>> >to
>> >> >other
>> >> >> > > >>>>>>> options:
>> >> >> > > >>>>>>>>>>>>>>>
>> >> >> > > >>>>>>>>>>>>>>> - Check whether platform is 32 or 64 bits
>and
>> >> >adapt
>> >> >> > > defaults.
>> >> >> > > >>>>>>> In
>> >> >> > > >>>>>>>>>>>> this
>> >> >> > > >>>>>>>>>>>>>>> case we still need to address the issue
>with
>> >> >multiple
>> >> >> > nodes
>> >> >> > > >>>>>> on
>> >> >> > > >>>>>>>> one
>> >> >> > > >>>>>>>>>>>>>>> machine
>> >> >> > > >>>>>>>>>>>>>>> even on 64 bit systems.
>> >> >> > > >>>>>>>>>>>>>>>
>> >> >> > > >>>>>>>>>>>>>>> - Lower defaults for maxMemory and
>allocate,
>> >for
>> >> >> > instance,
>> >> >> > > >>>>>>>>>>>> max(0.3 *
>> >> >> > > >>>>>>>>>>>>>>> availableMemory, 1Gb).
>> >> >> > > >>>>>>>>>>>>>>> This option allows us to solve all issues
>with
>> >> >starting
>> >> >> > on
>> >> >> > > 32
>> >> >> > > >>>>>>> bit
>> >> >> > > >>>>>>>>>>>>>>> platforms and reduce instability with
>multiple
>> >> >nodes on
>> >> >> > the
>> >> >> > > >>>>>>> same
>> >> >> > > >>>>>>>>>>>>>>> machine.
>> >> >> > > >>>>>>>>>>>>>>>
>> >> >> > > >>>>>>>>>>>>>>>
>> >> >> > > >>>>>>>>>>>>>>> Thoughts and/or other options?
>> >> >> > > >>>>>>>>>>>>>>>
>> >> >> > > >>>>>>>>>>>>>>> Thanks,
>> >> >> > > >>>>>>>>>>>>>>> Sergey.
>> >> >> > > >>>>>>>>>>>>>
>> >> >> > > >>>>>>>>>>>>>
>> >> >> > > >>>>>>>>>>
>> >> >> > > >>>>>>>>>>
>> >> >> > > >>>>>>>>>
>> >> >> > > >>>>>>>>
>> >> >> > > >>>>>>>
>> >> >> > > >>>>>>
>> >> >> > > >>>>
>> >> >> > > >>>>
>> >> >> > > >>
>> >> >> > > >>
>> >> >> > >
>> >> >> > >
>> >> >> >
>> >> >>
>> >>
>>

Re: [IGNITE-5717] improvements of MemoryPolicy default size

Sergey Chugunov
Dmitriy,

When an Ignite node "allocates memory", it actually just reserves a chunk of
its address space; almost no physical RAM is used.

I can easily start half a dozen Ignite nodes with the current defaults on my
laptop with only 16 GB of RAM; each node will "allocate" around 12 GB,
72 GB in total.
The laptop copes with this easily as long as I don't stream any data into
the grid.

But when I put some pressure on the grid, massive swapping of memory pages
shows up as the OS tries to keep a huge number of pages from different
processes in memory.

So a "we are running out of memory" indicator just doesn't work here.
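
A rough sketch of the arithmetic behind this, assuming the 80%-of-physical-RAM default discussed in this thread (which gives 12.8 GB per node on a 16 GB machine, in line with the "around 12" figure above). The function names are purely illustrative and are not part of Ignite:

```python
GIB = 1024 ** 3

def reserved_per_node(physical_ram_bytes, max_fraction=0.8):
    """Address space one node reserves under the assumed default (80% of RAM)."""
    return int(physical_ram_bytes * max_fraction)

def overcommit_report(physical_ram_bytes, node_count, max_fraction=0.8):
    """Compare the total reserved address space with the physical RAM available."""
    per_node = reserved_per_node(physical_ram_bytes, max_fraction)
    total = per_node * node_count
    return {
        "per_node_gib": per_node / GIB,
        "total_reserved_gib": total / GIB,
        # Ratio > 1 means the nodes together may touch more pages than RAM holds.
        "overcommit_ratio": total / physical_ram_bytes,
    }

# Six nodes on a 16 GiB laptop, as in the example above.
report = overcommit_report(16 * GIB, 6)
print(report)
```

The reservation itself costs nothing; the ratio only starts to matter once data is loaded and the pages are actually touched, which is when the swapping begins.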

Thanks,
Sergey.

On Fri, Aug 4, 2017 at 1:01 PM, <[hidden email]> wrote:

> But why? We allocate the memory, so we should know when it runs out. What
> am i missing?
>
> D.
>
> On Aug 4, 2017, 11:55 AM, at 11:55 AM, Sergey Chugunov <
> [hidden email]> wrote:
> >I used GC and java only as an example, they are not applicable to
> >Ignite
> >case where we manage offheap memory.
> >
> >My point is that there is no easy way to implement this feature in
> >Ignite,
> >and more time is needed to properly design it and account for all
> >risks.
> >
> >Thanks,
> >Sergey.
> >
> >On Fri, Aug 4, 2017 at 12:44 PM, <[hidden email]> wrote:
> >
> >> Hang on. I thought we were talking about offheap size, GC should not
> >be
> >> relevant. Am I wrong?
> >>
> >> D.
> >>
> >> On Aug 4, 2017, 11:38 AM, at 11:38 AM, Sergey Chugunov <
> >> [hidden email]> wrote:
> >> >Do you see an obvious way of implementing it?
> >> >
> >> >In java there is a heap and GC working on it. And for instance, it
> >is
> >> >possible to make a decision to throw an OOM based on some gc
> >metrics.
> >> >
> >> >I may be wrong but I don't see a mechanism in Ignite to use it right
> >> >away
> >> >for such purposes.
> >> >And implementing something without thorough planning brings huge
> >risk
> >> >of
> >> >false positives with nodes stopping when they don't have to.
> >> >
> >> >That's why I think it must be implemented and intensively tested as
> >> >part of
> >> >a separate ticket.
> >> >
> >> >Thanks,
> >> >Sergey.
> >> >
> >> >On Fri, Aug 4, 2017 at 12:18 PM, <[hidden email]> wrote:
> >> >
> >> >> Without #3, the #1 and #2 make little sense.
> >> >>
> >> >> Why is #3 so difficult?
> >> >>
> >> >> D.
> >> >>
> >> >> On Aug 4, 2017, 10:46 AM, at 10:46 AM, Sergey Chugunov <
> >> >> [hidden email]> wrote:
> >> >> >Dmitriy,
> >> >> >
> >> >> >Last item makes perfect sense to me, one may think of it as an
> >> >> >"OutOfMemoryException" in java.
> >> >> >However, it looks like such feature requires considerable efforts
> >to
> >> >> >properly design and implement it, so I would propose to create a
> >> >> >separate
> >> >> >ticket and agree upon target version for it.
> >> >> >
> >> >> >Items #1 and #2 will be implemented under IGNITE-5717. Makes
> >sense?
> >> >> >
> >> >> >Thanks,
> >> >> >Sergey.
> >> >> >
> >> >> >On Thu, Aug 3, 2017 at 4:34 AM, Dmitriy Setrakyan
> >> >> ><[hidden email]>
> >> >> >wrote:
> >> >> >
> >> >> >> Here is what we should do:
> >> >> >>
> >> >> >>    1. Pick an acceptable number. Does not matter if it is 10%
> >or
> >> >50%.
> >> >> >>    2. Print the allocated memory in *BOLD* letters into the
> >log.
> >> >> >>    3. Make sure that Ignite server never hangs due to the low
> >> >memory
> >> >> >issue.
> >> >> >>    We should sense it and kick the node out automatically,
> >again
> >> >with
> >> >> >a
> >> >> >> *BOLD*
> >> >> >>    message in the log.
> >> >> >>
> >> >> >>  Is this possible?
> >> >> >>
> >> >> >> D.
> >> >> >>
> >> >> >> On Wed, Aug 2, 2017 at 6:09 PM, Vladimir Ozerov
> >> >> ><[hidden email]>
> >> >> >> wrote:
> >> >> >>
> >> >> >> > My proposal is 10% instead of 80%.
> >> >> >> >
> >> >> >> > On Wed, Aug 2, 2017 at 6:54 PM, Denis Magda <[hidden email]> wrote:
> >> >> >> >
> >> >> >> > > Vladimir, Dmitriy P.,
> >> >> >> > >
> >> >> >> > > Please see inline
> >> >> >> > >
> >> >> >> > > > On Aug 2, 2017, at 7:20 AM, Vladimir Ozerov
> >> >> ><[hidden email]>
> >> >> >> > > wrote:
> >> >> >> > > >
> >> >> >> > > > Denis,
> >> >> >> > > >
> >> >> >> > > > The reason is that product should not hang user's
> >computer.
> >> >How
> >> >> >else
> >> >> >> > this
> >> >> >> > > > could be explained? I am developer. I start Ignite, 1
> >node,
> >> >2
> >> >> >nodes,
> >> >> >> X
> >> >> >> > > > nodes, observe how they join topology. Add one key, 10
> >keys,
> >> >1M
> >> >> >keys.
> >> >> >> > > Then
> >> >> >> > > > I do a bug in example and load 100M keys accidentally -
> >> >restart
> >> >> >the
> >> >> >> > > > computer. Correct behavior is to have small "maxMemory"
> >by
> >> >> >default to
> >> >> >> > > avoid
> >> >> >> > > > that. User should get exception instead of hang. E.g.
> >Java's
> >> >> >"-Xmx"
> >> >> >> is
> >> >> >> > > > typically 25% of RAM - more adequate value, comparing to
> >> >> >Ignite.
> >> >> >> > > >
> >> >> >> > >
> >> >> >> > > Right, the developer was educated about the Java heap
> >> >parameters
> >> >> >and
> >> >> >> > > limited the overall space preferring OOM to the laptop
> >> >> >suspension. Who
> >> >> >> > > knows how he got to the point that 25% RAM should be used.
> >> >That
> >> >> >might
> >> >> >> > have
> >> >> >> > > been deep knowledge about JVM or he faced several hangs
> >while
> >> >> >testing
> >> >> >> the
> >> >> >> > > application.
> >> >> >> > >
> >> >> >> > > Anyway, JVM creators didn’t decide to predefine the Java
> >heap
> >> >to
> >> >> >a
> >> >> >> static
> >> >> >> > > value to avoid the situations like above. So should not we
> >as
> >> >a
> >> >> >> platform.
> >> >> >> > > Educate people about the Ignite memory behavior like Sun
> >did
> >> >for
> >> >> >the
> >> >> >> Java
> >> >> >> > > heap but do not try to solve the lack of knowledge with the
> >> >> >default
> >> >> >> > static
> >> >> >> > > memory size.
> >> >> >> > >
> >> >> >> > >
> >> >> >> > > > It doesn't matter whether you use persistence or not.
> >> >> >Persistent case
> >> >> >> > > just
> >> >> >> > > > makes this flaw more obvious - you have virtually
> >unlimited
> >> >> >disk, and
> >> >> >> > yet
> >> >> >> > > > you end up with swapping and hang when using Ignite with
> >> >> >default
> >> >> >> > > > configuration. As already explained, the problem is not
> >> >about
> >> >> >> > allocating
> >> >> >> > > > "maxMemory" right away, but about the value of
> >"maxMemory" -
> >> >it
> >> >> >is
> >> >> >> too
> >> >> >> > > big.
> >> >> >> > > >
> >> >> >> > >
> >> >> >> > > How do you know what should be the default then? Why 1 GB?
> >For
> >> >> >> instance,
> >> >> >> > > if I end up having only 1 GB of free memory left and try to
> >> >start
> >> >> >2
> >> >> >> > server
> >> >> >> > > nodes and an application I will face the laptop suspension
> >> >again.
> >> >> >> > >
> >> >> >> > > —
> >> >> >> > > Denis
> >> >> >> > >
> >> >> >> > > > "We had this behavior before" is never an argument.
> >Previous
> >> >> >offheap
> >> >> >> > > > implementation had a lot of flaws, so let's just forget
> >> >about
> >> >> >it.
> >> >> >> > > >
> >> >> >> > > > On Wed, Aug 2, 2017 at 5:08 PM, Denis Magda
> >> ><[hidden email]>
> >> >> >> wrote:
> >> >> >> > > >
> >> >> >> > > >> Sergey,
> >> >> >> > > >>
> >> >> >> > > >> That’s expectable because as we revealed from this
> >> >discussion
> >> >> >the
> >> >> >> > > >> allocation works different depending on whether the
> >> >> >persistence is
> >> >> >> > used
> >> >> >> > > or
> >> >> >> > > >> not:
> >> >> >> > > >>
> >> >> >> > > >> 1) In-memory mode (the persistence is disabled) - the
> >space
> >> >> >will be
> >> >> >> > > >> allocated incrementally until the max threshold is
> >reached.
> >> >> >Good!
> >> >> >> > > >>
> >> >> >> > > >> 2) The persistence mode - the whole space (limited by
> >the
> >> >max
> >> >> >> > threshold)
> >> >> >> > > >> is allocated right away. It’s not surprising that your
> >> >laptop
> >> >> >starts
> >> >> >> > > >> choking.
> >> >> >> > > >>
> >> >> >> > > >> So, in my previous response I tried to explain that I
> >can’t
> >> >> >find any
> >> >> >> > > >> reason why we should adjust 1). Any reasons except for
> >the
> >> >> >massive
> >> >> >> > > >> preloading?
> >> >> >> > > >>
> >> >> >> > > >> As for 2), that was a big surprise to reveal this after
> >2.1
> >> >> >release.
> >> >> >> > > >> Definitely we have to fix this somehow.
> >> >> >> > > >>
> >> >> >> > > >> —
> >> >> >> > > >> Denis
> >> >> >> > > >>
> >> >> >> > > >>> On Aug 2, 2017, at 6:59 AM, Sergey Chugunov <
> >> >> >> > [hidden email]
> >> >> >> > > >
> >> >> >> > > >> wrote:
> >> >> >> > > >>>
> >> >> >> > > >>> Denis,
> >> >> >> > > >>>
> >> >> >> > > >>> Just a simple example from our own codebase: I tried to
> >> >> >execute
> >> >> >> > > >>> PersistentStoreExample with default settings and two
> >> >server
> >> >> >nodes
> >> >> >> and
> >> >> >> > > >>> client node got frozen even on initial load of data
> >into
> >> >the
> >> >> >grid.
> >> >> >> > > >>> Although with one server node the example finishes
> >pretty
> >> >> >quickly.
> >> >> >> > > >>>
> >> >> >> > > >>> And my laptop isn't the weakest one and has 16 gigs of
> >> >> >memory, but
> >> >> >> it
> >> >> >> > > >>> cannot deal with it.
> >> >> >> > > >>>
> >> >> >> > > >>>
> >> >> >> > > >>> On Wed, Aug 2, 2017 at 4:58 PM, Denis Magda
> >> >> ><[hidden email]>
> >> >> >> > wrote:
> >> >> >> > > >>>
> >> >> >> > > >>>>> As far as allocating 80% of available RAM - I was
> >> >against
> >> >> >this
> >> >> >> even
> >> >> >> > > for
> >> >> >> > > >>>>> In-memory mode and still think that this is a wrong
> >> >> >default.
> >> >> >> > Looking
> >> >> >> > > at
> >> >> >> > > >>>>> free RAM is even worse because it gives you undefined
> >> >> >behavior.
> >> >> >> > > >>>>
> >> >> >> > > >>>> Guys, I can not understand how this dynamic memory
> >> >> >allocation's
> >> >> >> > > >> high-level
> >> >> >> > > >>>> behavior (with the persistence DISABLED) is different
> >> >from
> >> >> >the
> >> >> >> > legacy
> >> >> >> > > >>>> off-heap memory we had in 1.x. Both off-heap memories
> >> >> >allocate the
> >> >> >> > > >> space on
> >> >> >> > > >>>> demand, the current just does this more aggressively
> >> >> >requesting
> >> >> >> big
> >> >> >> > > >> chunks.
> >> >> >> > > >>>>
> >> >> >> > > >>>> Next, the legacy one was unlimited by default and the
> >> >user
> >> >> >can
> >> >> >> start
> >> >> >> > > as
> >> >> >> > > >>>> many nodes as he wanted on a laptop and preload as
> >much
> >> >data
> >> >> >as he
> >> >> >> > > >> needed.
> >> >> >> > > >>>> Sure he could bring down the laptop if too many
> >entries
> >> >were
> >> >> >> > injected
> >> >> >> > > >> into
> >> >> >> > > >>>> the local cluster. But that’s about too massive
> >> >preloading
> >> >> >and not
> >> >> >> > > >> caused
> >> >> >> > > >>>> by the ability of the legacy off-heap memory to grow
> >> >> >infinitely.
> >> >> >> The
> >> >> >> > > >> same
> >> >> >> > > >>>> preloading would cause a hang if the Java heap memory
> >> >mode
> >> >> >is
> >> >> >> used.
> >> >> >> > > >>>>
> >> >> >> > > >>>> The upshot is that the massive preloading of data on
> >the
> >> >> >local
> >> >> >> > laptop
> >> >> >> > > >>>> should not fixed with repealing of the dynamic memory
> >> >> >allocation.
> >> >> >> > > >>>> Is there any other reason why we have to use the
> >static
> >> >> >memory
> >> >> >> > > >> allocation
> >> >> >> > > >>>> for the case when the persistence is disabled? I think
> >> >the
> >> >> >case
> >> >> >> with
> >> >> >> > > the
> >> >> >> > > >>>> persistence should be reviewed separately.
> >> >> >> > > >>>>
> >> >> >> > > >>>> —
> >> >> >> > > >>>> Denis
> >> >> >> > > >>>>
> >> >> >> > > >>>>> On Aug 2, 2017, at 12:45 AM, Alexey Goncharuk <
> >> >> >> > > >>>> [hidden email]> wrote:
> >> >> >> > > >>>>>
> >> >> >> > > >>>>> Dmitriy,
> >> >> >> > > >>>>>
> >> >> >> > > >>>>> The reason behind this is the need to to be able to
> >> >evict
> >> >> >and
> >> >> >> load
> >> >> >> > > >> pages
> >> >> >> > > >>>> to
> >> >> >> > > >>>>> disk, thus we need to preserve a PageId->Pointer
> >mapping
> >> >in
> >> >> >> memory.
> >> >> >> > > In
> >> >> >> > > >>>>> order to do this in the most efficient way, we need
> >to
> >> >know
> >> >> >in
> >> >> >> > > advance
> >> >> >> > > >>>> all
> >> >> >> > > >>>>> the address ranges we work with. We can add dynamic
> >> >memory
> >> >> >> > extension
> >> >> >> > > >> for
> >> >> >> > > >>>>> persistence-enabled config, but this will add yet
> >> >another
> >> >> >step of
> >> >> >> > > >>>>> indirection when resolving every page address, which
> >> >adds a
> >> >> >> > > noticeable
> >> >> >> > > >>>>> performance penalty.
> >> >> >> > > >>>>>
> >> >> >> > > >>>>>
> >> >> >> > > >>>>>
> >> >> >> > > >>>>> 2017-08-02 10:37 GMT+03:00 Dmitriy Setrakyan <
> >> >> >> > [hidden email]
> >> >> >> > > >:
> >> >> >> > > >>>>>
> >> >> >> > > >>>>>> On Wed, Aug 2, 2017 at 9:33 AM, Vladimir Ozerov <
> >> >> >> > > [hidden email]
> >> >> >> > > >>>
> >> >> >> > > >>>>>> wrote:
> >> >> >> > > >>>>>>
> >> >> >> > > >>>>>>> Dima,
> >> >> >> > > >>>>>>>
> >> >> >> > > >>>>>>> Probably folks who worked closely with storage know
> >> >why.
> >> >> >> > > >>>>>>>
> >> >> >> > > >>>>>>
> >> >> >> > > >>>>>> Without knowing why, how can we make a decision?
> >> >> >> > > >>>>>>
> >> >> >> > > >>>>>> Alexey Goncharuk, was it you who made the decision
> >> >about
> >> >> >not
> >> >> >> using
> >> >> >> > > >>>>>> increments? Do know remember what was the reason?
> >> >> >> > > >>>>>>
> >> >> >> > > >>>>>>
> >> >> >> > > >>>>>>>
> >> >> >> > > >>>>>>> The very problem is that before being started once
> >on
> >> >> >> production
> >> >> >> > > >>>>>>> environment, Ignite will typically be started
> >hundred
> >> >> >times on
> >> >> >> > > >>>>>> developer's
> >> >> >> > > >>>>>>> environment. I think that default should be ~10% of
> >> >total
> >> >> >RAM.
> >> >> >> > > >>>>>>>
> >> >> >> > > >>>>>>
> >> >> >> > > >>>>>> Why not 80% of *free *RAM?
> >> >> >> > > >>>>>>
> >> >> >> > > >>>>>>
> >> >> >> > > >>>>>>>
> >> >> >> > > >>>>>>> On Wed, Aug 2, 2017 at 10:21 AM, Dmitriy Setrakyan
> ><
> >> >> >> > > >>>>>> [hidden email]>
> >> >> >> > > >>>>>>> wrote:
> >> >> >> > > >>>>>>>
> >> >> >> > > >>>>>>>> On Wed, Aug 2, 2017 at 7:27 AM, Vladimir Ozerov <
> >> >> >> > > >> [hidden email]
> >> >> >> > > >>>>>
> >> >> >> > > >>>>>>>> wrote:
> >> >> >> > > >>>>>>>>
> >> >> >> > > >>>>>>>>> Please see original Sergey's message - when
> >> >persistence
> >> >> >is
> >> >> >> > > enabled,
> >> >> >> > > >>>>>>>> memory
> >> >> >> > > >>>>>>>>> is not allocated incrementally, maxSize is used.
> >> >> >> > > >>>>>>>>>
> >> >> >> > > >>>>>>>>
> >> >> >> > > >>>>>>>> Why?
> >> >> >> > > >>>>>>>>
> >> >> >> > > >>>>>>>>
> >> >> >> > > >>>>>>>>> Default settings must allow for normal work on
> >> >> >developer's
> >> >> >> > > >>>>>> environment.
> >> >> >> > > >>>>>>>>>
> >> >> >> > > >>>>>>>>
> >> >> >> > > >>>>>>>> Agree, but why not in increments?
> >> >> >> > > >>>>>>>>
> >> >> >> > > >>>>>>>>
> >> >> >> > > >>>>>>>>>
> >> >> >> > > >>>>>>>>> On Wed, Aug 2, 2017 at 1:10 AM, Denis Magda
> >> >> ><[hidden email]> wrote:
> >> >> >> > > >>>>>>>>>
> >> >> >> > > >>>>>>>>>>> Why not allocate in increments automatically?
> >> >> >> > > >>>>>>>>>>
> >> >> >> > > >>>>>>>>>> This is exactly how the allocation works right
> >now.
> >> >> >The
> >> >> >> memory
> >> >> >> > > >> will
> >> >> >> > > >>>>>>>> grow
> >> >> >> > > >>>>>>>>>> incrementally until the max size is reached (80%
> >of
> >> >> >RAM by
> >> >> >> > > >>>>>> default).
> >> >> >> > > >>>>>>>>>>
> >> >> >> > > >>>>>>>>>> —
> >> >> >> > > >>>>>>>>>> Denis
> >> >> >> > > >>>>>>>>>>
> >> >> >> > > >>>>>>>>>>> On Aug 1, 2017, at 3:03 PM,
> >[hidden email]
> >> >> >wrote:
> >> >> >> > > >>>>>>>>>>>
> >> >> >> > > >>>>>>>>>>> Vova, 1GB seems a bit too small for me, and
> >> >frankly i
> >> >> >do
> >> >> >> not
> >> >> >> > > want
> >> >> >> > > >>>>>>> t o
> >> >> >> > > >>>>>>>>>> guess. Why not allocate in increments
> >> >automatically?
> >> >> >> > > >>>>>>>>>>>
> >> >> >> > > >>>>>>>>>>> D.
> >> >> >> > > >>>>>>>>>>>
> >> >> >> > > >>>>>>>>>>> On Aug 1, 2017, 11:03 PM, at 11:03 PM, Vladimir
> >> >> >Ozerov <
> >> >> >> > > >>>>>>>>>> [hidden email]> wrote:
> >> >> >> > > >>>>>>>>>>>> Denis,
> >> >> >> > > >>>>>>>>>>>> No doubts you haven't heard about it - AI 2.1
> >> >with
> >> >> >> > > persistence,
> >> >> >> > > >>>>>>> when
> >> >> >> > > >>>>>>>>>>>> 80% of
> >> >> >> > > >>>>>>>>>>>> RAM is allocated right away, was released
> >several
> >> >> >days
> >> >> >> ago.
> >> >> >> > > How
> >> >> >> > > >>>>>> do
> >> >> >> > > >>>>>>>> you
> >> >> >> > > >>>>>>>>>>>> think, how many users tried it already?
> >> >> >> > > >>>>>>>>>>>>
> >> >> >> > > >>>>>>>>>>>> Guys,
> >> >> >> > > >>>>>>>>>>>> Do you really think allocating 80% of
> >available
> >> >RAM
> >> >> >is a
> >> >> >> > > normal
> >> >> >> > > >>>>>>>> thing?
> >> >> >> > > >>>>>>>>>>>> Take
> >> >> >> > > >>>>>>>>>>>> your laptop and check how many available RAM
> >you
> >> >> >have
> >> >> >> right
> >> >> >> > > now.
> >> >> >> > > >>>>>>> Do
> >> >> >> > > >>>>>>>>> you
> >> >> >> > > >>>>>>>>>>>> fit
> >> >> >> > > >>>>>>>>>>>> to remaining 20%? If not, then running AI with
> >> >> >persistence
> >> >> >> > > with
> >> >> >> > > >>>>>>> all
> >> >> >> > > >>>>>>>>>>>> defaults will bring your machine down. This is
> >> >> >insane. We
> >> >> >> > > shold
> >> >> >> > > >>>>>>>>>>>> allocate no
> >> >> >> > > >>>>>>>>>>>> more than 1Gb, so that user can play with it
> >> >without
> >> >> >any
> >> >> >> > > >>>>>> problems.
> >> >> >> > > >>>>>>>>>>>>
> >> >> >> > > >>>>>>>>>>>> On Tue, Aug 1, 2017 at 10:26 PM, Denis Magda <
> >> >> >> > > [hidden email]
> >> >> >> > > >>>>>>>
> >> >> >> > > >>>>>>>>> wrote:
> >> >> >> > > >>>>>>>>>>>>
> >> >> >> > > >>>>>>>>>>>>> My vote goes for option #1 too. I don’t think
> >> >that
> >> >> >80% is
> >> >> >> > too
> >> >> >> > > >>>>>>>>>>>> aggressive
> >> >> >> > > >>>>>>>>>>>>> to bring it down.
> >> >> >> > > >>>>>>>>>>>>>
> >> >> >> > > >>>>>>>>>>>>> IGNITE-5717 was created to fix the issue of
> >the
> >> >80%
> >> >> >RAM
> >> >> >> > > >>>>>>> allocation
> >> >> >> > > >>>>>>>> on
> >> >> >> > > >>>>>>>>>>>> 64
> >> >> >> > > >>>>>>>>>>>>> bit systems when Ignite works on top of 32
> >bit
> >> >JVM.
> >> >> >I’ve
> >> >> >> > not
> >> >> >> > > >>>>>>> heard
> >> >> >> > > >>>>>>>> of
> >> >> >> > > >>>>>>>>>>>> any
> >> >> >> > > >>>>>>>>>>>>> other complaints in regards the default
> >> >allocation
> >> >> >size.
> >> >> >> > > >>>>>>>>>>>>>
> >> >> >> > > >>>>>>>>>>>>> —
> >> >> >> > > >>>>>>>>>>>>> Denis
> >> >> >> > > >>>>>>>>>>>>>
> >> >> >> > > >>>>>>>>>>>>>> On Aug 1, 2017, at 10:58 AM,
> >> >[hidden email]
> >> >> >> wrote:
> >> >> >> > > >>>>>>>>>>>>>>
> >> >> >> > > >>>>>>>>>>>>>> I prefer option #1.
> >> >> >> > > >>>>>>>>>>>>>>
> >> >> >> > > >>>>>>>>>>>>>> D.
> >> >> >> > > >>>>>>>>>>>>>>
> >> >> >> > > >>>>>>>>>>>>>> On Aug 1, 2017, 11:20 AM, at 11:20 AM,
> >Sergey
> >> >> >Chugunov <
> >> >> >> > > >>>>>>>>>>>>> [hidden email]> wrote:
> >> >> >> > > >>>>>>>>>>>>>>> Folks,
> >> >> >> > > >>>>>>>>>>>>>>>
> >> >> >> > > >>>>>>>>>>>>>>> I would like to get back to the question
> >about
> >> >> >> > MemoryPolicy
> >> >> >> > > >>>>>>>>>>>> maxMemory
> >> >> >> > > >>>>>>>>>>>>>>> defaults.
> >> >> >> > > >>>>>>>>>>>>>>>
> >> >> >> > > >>>>>>>>>>>>>>> Although MemoryPolicy may be configured
> >with
> >> >> >initial
> >> >> >> and
> >> >> >> > > >>>>>>>> maxMemory
> >> >> >> > > >>>>>>>>>>>>>>> settings, when persistence is used
> >> >MemoryPolicy
> >> >> >always
> >> >> >> > > >>>>>>> allocates
> >> >> >> > > >>>>>>>>>>>>>>> maxMemory
> >> >> >> > > >>>>>>>>>>>>>>> size for performance reasons.
> >> >> >> > > >>>>>>>>>>>>>>>
> >> >> >> > > >>>>>>>>>>>>>>> As default size of maxMemory is 80% of
> >> >physical
> >> >> >memory
> >> >> >> it
> >> >> >> > > >>>>>>> causes
> >> >> >> > > >>>>>>>>>>>> OOME
> >> >> >> > > >>>>>>>>>>>>>>> exceptions of 32 bit platforms (either on
> >OS
> >> >or
> >> >> >JVM
> >> >> >> > level)
> >> >> >> > > >>>>>> and
> >> >> >> > > >>>>>>>>>>>> hurts
> >> >> >> > > >>>>>>>>>>>>>>> performance in setups when multiple Ignite
> >> >nodes
> >> >> >are
> >> >> >> > > started
> >> >> >> > > >>>>>> on
> >> >> >> > > >>>>>>>>>>>> the
> >> >> >> > > >>>>>>>>>>>>>>> same
> >> >> >> > > >>>>>>>>>>>>>>> physical server.
> >> >> >> > > >>>>>>>>>>>>>>>
> >> >> >> > > >>>>>>>>>>>>>>> I suggest to rethink these defaults and
> >switch
> >> >to
> >> >> >other
> >> >> >> > > >>>>>>> options:
> >> >> >> > > >>>>>>>>>>>>>>>
> >> >> >> > > >>>>>>>>>>>>>>> - Check whether platform is 32 or 64 bits
> >and
> >> >> >adapt
> >> >> >> > > defaults.
> >> >> >> > > >>>>>>> In
> >> >> >> > > >>>>>>>>>>>> this
> >> >> >> > > >>>>>>>>>>>>>>> case we still need to address the issue
> >with
> >> >> >multiple
> >> >> >> > nodes
> >> >> >> > > >>>>>> on
> >> >> >> > > >>>>>>>> one
> >> >> >> > > >>>>>>>>>>>>>>> machine
> >> >> >> > > >>>>>>>>>>>>>>> even on 64 bit systems.
> >> >> >> > > >>>>>>>>>>>>>>>
> >> >> >> > > >>>>>>>>>>>>>>> - Lower defaults for maxMemory and
> >allocate,
> >> >for
> >> >> >> > instance,
> >> >> >> > > >>>>>>>>>>>> max(0.3 *
> >> >> >> > > >>>>>>>>>>>>>>> availableMemory, 1Gb).
> >> >> >> > > >>>>>>>>>>>>>>> This option allows us to solve all issues
> >with
> >> >> >starting
> >> >> >> > on
> >> >> >> > > 32
> >> >> >> > > >>>>>>> bit
> >> >> >> > > >>>>>>>>>>>>>>> platforms and reduce instability with
> >multiple
> >> >> >nodes on
> >> >> >> > the
> >> >> >> > > >>>>>>> same
> >> >> >> > > >>>>>>>>>>>>>>> machine.
> >> >> >> > > >>>>>>>>>>>>>>>
> >> >> >> > > >>>>>>>>>>>>>>>
> >> >> >> > > >>>>>>>>>>>>>>> Thoughts and/or other options?
> >> >> >> > > >>>>>>>>>>>>>>>
> >> >> >> > > >>>>>>>>>>>>>>> Thanks,
> >> >> >> > > >>>>>>>>>>>>>>> Sergey.
> >> >> >> > > >>>>>>>>>>>>>
> >> >> >> > > >>>>>>>>>>>>>
> >> >> >> > > >>>>>>>>>>
> >> >> >> > > >>>>>>>>>>
> >> >> >> > > >>>>>>>>>
> >> >> >> > > >>>>>>>>
> >> >> >> > > >>>>>>>
> >> >> >> > > >>>>>>
> >> >> >> > > >>>>
> >> >> >> > > >>>>
> >> >> >> > > >>
> >> >> >> > > >>
> >> >> >> > >
> >> >> >> > >
> >> >> >> >
> >> >> >>
> >> >>
> >>
>

Re: [IGNITE-5717] improvements of MemoryPolicy default size

Sergey Chugunov
Folks,

I filed a ticket [1] to address the concern of starting too many nodes.
Please review it.

[1] https://issues.apache.org/jira/browse/IGNITE-6003

Thanks,
Sergey.
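
The figures in the quoted message below can be checked with a small calculation. This is only an illustrative sketch: the class and method names are hypothetical, and the 12 GB per node is the approximate figure from the quoted message, not a measured value.

```java
// Hypothetical helper, not part of Ignite: compares the address space
// reserved by several nodes on one host with the host's physical RAM.
class ReservationSketch {
    /** Total memory "allocated" (reserved) by all nodes on one host, in GB. */
    static long totalReservedGb(int nodes, long perNodeGb) {
        return nodes * perNodeGb;
    }

    /** How many times the reservation exceeds physical RAM. */
    static double overcommitRatio(int nodes, long perNodeGb, long physicalGb) {
        return (double) totalReservedGb(nodes, perNodeGb) / physicalGb;
    }

    public static void main(String[] args) {
        // Six nodes at roughly 12 GB each on a 16 GB laptop.
        System.out.println(totalReservedGb(6, 12));      // 72
        System.out.println(overcommitRatio(6, 12, 16));  // 4.5
    }
}
```

A ratio above 1 is harmless while the pages stay untouched, which is why a check based on reserved size alone cannot tell when the host is about to start swapping.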

On Mon, Aug 7, 2017 at 12:53 PM, Sergey Chugunov <[hidden email]>
wrote:

> Dmitriy,
>
> When an Ignite node "allocates memory" it actually just reserves a chunk of
> its address space; almost no physical RAM is used.
>
> I can easily start half a dozen Ignite nodes with the current defaults on
> my laptop with only 16 GB of RAM; each node will "allocate" around 12
> GB, 72 GB in total.
> The laptop copes with this easily as long as I don't stream any data to the
> grid.
>
> But when I put some pressure on the grid, massive swapping of memory pages
> shows up as the OS tries to keep a huge number of pages from
> different processes in memory.
>
> So a "we are running out of memory" indicator just doesn't work here.
>
> Thanks,
> Sergey.
>
> On Fri, Aug 4, 2017 at 1:01 PM, <[hidden email]> wrote:
>
>> But why? We allocate the memory, so we should know when it runs out. What
>> am i missing?
>>
>> D.
>>
>> On Aug 4, 2017, 11:55 AM, at 11:55 AM, Sergey Chugunov <
>> [hidden email]> wrote:
>> >I used GC and java only as an example, they are not applicable to
>> >Ignite
>> >case where we manage offheap memory.
>> >
>> >My point is that there is no easy way to implement this feature in
>> >Ignite,
>> >and more time is needed to properly design it and account for all
>> >risks.
>> >
>> >Thanks,
>> >Sergey.
>> >
>> >On Fri, Aug 4, 2017 at 12:44 PM, <[hidden email]> wrote:
>> >
>> >> Hang on. I thought we were talking about offheap size, GC should not
>> >be
>> >> relevant. Am I wrong?
>> >>
>> >> D.
>> >>
>> >> On Aug 4, 2017, 11:38 AM, at 11:38 AM, Sergey Chugunov <
>> >> [hidden email]> wrote:
>> >> >Do you see an obvious way of implementing it?
>> >> >
>> >> >In java there is a heap and GC working on it. And for instance, it
>> >is
>> >> >possible to make a decision to throw an OOM based on some gc
>> >metrics.
>> >> >
>> >> >I may be wrong but I don't see a mechanism in Ignite to use it right
>> >> >away
>> >> >for such purposes.
>> >> >And implementing something without thorough planning brings huge
>> >risk
>> >> >of
>> >> >false positives with nodes stopping when they don't have to.
>> >> >
>> >> >That's why I think it must be implemented and intensively tested as
>> >> >part of
>> >> >a separate ticket.
>> >> >
>> >> >Thanks,
>> >> >Sergey.
>> >> >
>> >> >On Fri, Aug 4, 2017 at 12:18 PM, <[hidden email]> wrote:
>> >> >
>> >> >> Without #3, the #1 and #2 make little sense.
>> >> >>
>> >> >> Why is #3 so difficult?
>> >> >>
>> >> >> ⁣D.​
>> >> >>
>> >> >> On Aug 4, 2017, 10:46 AM, at 10:46 AM, Sergey Chugunov <
>> >> >> [hidden email]> wrote:
>> >> >> >Dmitriy,
>> >> >> >
>> >> >> >Last item makes perfect sense to me, one may think of it as an
>> >> >> >"OutOfMemoryException" in java.
>> >> >> >However, it looks like such feature requires considerable efforts
>> >to
>> >> >> >properly design and implement it, so I would propose to create a
>> >> >> >separate
>> >> >> >ticket and agree upon target version for it.
>> >> >> >
>> >> >> >Items #1 and #2 will be implemented under IGNITE-5717. Makes
>> >sense?
>> >> >> >
>> >> >> >Thanks,
>> >> >> >Sergey.
>> >> >> >
>> >> >> >On Thu, Aug 3, 2017 at 4:34 AM, Dmitriy Setrakyan
>> >> >> ><[hidden email]>
>> >> >> >wrote:
>> >> >> >
>> >> >> >> Here is what we should do:
>> >> >> >>
>> >> >> >>    1. Pick an acceptable number. Does not matter if it is 10%
>> >or
>> >> >50%.
>> >> >> >>    2. Print the allocated memory in *BOLD* letters into the
>> >log.
>> >> >> >>    3. Make sure that Ignite server never hangs due to the low
>> >> >memory
>> >> >> >issue.
>> >> >> >>    We should sense it and kick the node out automatically,
>> >again
>> >> >with
>> >> >> >a
>> >> >> >> *BOLD*
>> >> >> >>    message in the log.
>> >> >> >>
>> >> >> >>  Is this possible?
>> >> >> >>
>> >> >> >> D.
>> >> >> >>
>> >> >> >> On Wed, Aug 2, 2017 at 6:09 PM, Vladimir Ozerov
>> >> >> ><[hidden email]>
>> >> >> >> wrote:
>> >> >> >>
>> >> >> >> > My proposal is 10% instead of 80%.
>> >> >> >> >
>> >> >> >> > Wed, Aug 2, 2017 at 18:54, Denis Magda <[hidden email]>:
>> >> >> >> >
>> >> >> >> > > Vladimir, Dmitriy P.,
>> >> >> >> > >
>> >> >> >> > > Please see inline
>> >> >> >> > >
>> >> >> >> > > > On Aug 2, 2017, at 7:20 AM, Vladimir Ozerov
>> >> >> ><[hidden email]>
>> >> >> >> > > wrote:
>> >> >> >> > > >
>> >> >> >> > > > Denis,
>> >> >> >> > > >
>> >> >> >> > > > The reason is that product should not hang user's
>> >computer.
>> >> >How
>> >> >> >else
>> >> >> >> > this
>> >> >> >> > > > could be explained? I am developer. I start Ignite, 1
>> >node,
>> >> >2
>> >> >> >nodes,
>> >> >> >> X
>> >> >> >> > > > nodes, observe how they join topology. Add one key, 10
>> >keys,
>> >> >1M
>> >> >> >keys.
>> >> >> >> > > Then
>> >> >> >> > > > I do a bug in example and load 100M keys accidentally -
>> >> >restart
>> >> >> >the
>> >> >> >> > > > computer. Correct behavior is to have small "maxMemory"
>> >by
>> >> >> >default to
>> >> >> >> > > avoid
>> >> >> >> > > > that. User should get exception instead of hang. E.g.
>> >Java's
>> >> >> >"-Xmx"
>> >> >> >> is
>> >> >> >> > > > typically 25% of RAM - more adequate value, comparing to
>> >> >> >Ignite.
>> >> >> >> > > >
>> >> >> >> > >
>> >> >> >> > > Right, the developer was educated about the Java heap
>> >> >parameters
>> >> >> >and
>> >> >> >> > > limited the overall space preferring OOM to the laptop
>> >> >> >suspension. Who
>> >> >> >> > > knows how he got to the point that 25% RAM should be used.
>> >> >That
>> >> >> >might
>> >> >> >> > have
>> >> >> >> > > been deep knowledge about JVM or he faced several hangs
>> >while
>> >> >> >testing
>> >> >> >> the
>> >> >> >> > > application.
>> >> >> >> > >
>> >> >> >> > > Anyway, JVM creators didn’t decide to predefine the Java
>> >heap
>> >> >to
>> >> >> >a
>> >> >> >> static
>> >> >> >> > > value to avoid the situations like above. So should not we
>> >as
>> >> >a
>> >> >> >> platform.
>> >> >> >> > > Educate people about the Ignite memory behavior like Sun
>> >did
>> >> >for
>> >> >> >the
>> >> >> >> Java
>> >> >> >> > > heap but do not try to solve the lack of knowledge with the
>> >> >> >default
>> >> >> >> > static
>> >> >> >> > > memory size.
>> >> >> >> > >
>> >> >> >> > >
>> >> >> >> > > > It doesn't matter whether you use persistence or not.
>> >> >> >Persistent case
>> >> >> >> > > just
>> >> >> >> > > > makes this flaw more obvious - you have virtually
>> >unlimited
>> >> >> >disk, and
>> >> >> >> > yet
>> >> >> >> > > > you end up with swapping and hang when using Ignite with
>> >> >> >default
>> >> >> >> > > > configuration. As already explained, the problem is not
>> >> >about
>> >> >> >> > allocating
>> >> >> >> > > > "maxMemory" right away, but about the value of
>> >"maxMemory" -
>> >> >it
>> >> >> >is
>> >> >> >> too
>> >> >> >> > > big.
>> >> >> >> > > >
>> >> >> >> > >
>> >> >> >> > > How do you know what should be the default then? Why 1 GB?
>> >For
>> >> >> >> instance,
>> >> >> >> > > if I end up having only 1 GB of free memory left and try to
>> >> >start
>> >> >> >2
>> >> >> >> > server
>> >> >> >> > > nodes and an application I will face the laptop suspension
>> >> >again.
>> >> >> >> > >
>> >> >> >> > > —
>> >> >> >> > > Denis
>> >> >> >> > >
>> >> >> >> > > > "We had this behavior before" is never an argument.
>> >Previous
>> >> >> >offheap
>> >> >> >> > > > implementation had a lot of flaws, so let's just forget
>> >> >about
>> >> >> >it.
>> >> >> >> > > >
>> >> >> >> > > > On Wed, Aug 2, 2017 at 5:08 PM, Denis Magda
>> >> ><[hidden email]>
>> >> >> >> wrote:
>> >> >> >> > > >
>> >> >> >> > > >> Sergey,
>> >> >> >> > > >>
>> >> >> >> > > >> That’s expectable because as we revealed from this
>> >> >discussion
>> >> >> >the
>> >> >> >> > > >> allocation works different depending on whether the
>> >> >> >persistence is
>> >> >> >> > used
>> >> >> >> > > or
>> >> >> >> > > >> not:
>> >> >> >> > > >>
>> >> >> >> > > >> 1) In-memory mode (the persistence is disabled) - the
>> >space
>> >> >> >will be
>> >> >> >> > > >> allocated incrementally until the max threshold is
>> >reached.
>> >> >> >Good!
>> >> >> >> > > >>
>> >> >> >> > > >> 2) The persistence mode - the whole space (limited by
>> >the
>> >> >max
>> >> >> >> > threshold)
>> >> >> >> > > >> is allocated right away. It’s not surprising that your
>> >> >laptop
>> >> >> >starts
>> >> >> >> > > >> choking.
>> >> >> >> > > >>
>> >> >> >> > > >> So, in my previous response I tried to explain that I
>> >can’t
>> >> >> >find any
>> >> >> >> > > >> reason why we should adjust 1). Any reasons except for
>> >the
>> >> >> >massive
>> >> >> >> > > >> preloading?
>> >> >> >> > > >>
>> >> >> >> > > >> As for 2), that was a big surprise to reveal this after
>> >2.1
>> >> >> >release.
>> >> >> >> > > >> Definitely we have to fix this somehow.
>> >> >> >> > > >>
>> >> >> >> > > >> —
>> >> >> >> > > >> Denis
>> >> >> >> > > >>
>> >> >> >> > > >>> On Aug 2, 2017, at 6:59 AM, Sergey Chugunov <
>> >> >> >> > [hidden email]
>> >> >> >> > > >
>> >> >> >> > > >> wrote:
>> >> >> >> > > >>>
>> >> >> >> > > >>> Denis,
>> >> >> >> > > >>>
>> >> >> >> > > >>> Just a simple example from our own codebase: I tried to
>> >> >> >execute
>> >> >> >> > > >>> PersistentStoreExample with default settings and two
>> >> >server
>> >> >> >nodes
>> >> >> >> and
>> >> >> >> > > >>> client node got frozen even on initial load of data
>> >into
>> >> >the
>> >> >> >grid.
>> >> >> >> > > >>> Although with one server node the example finishes
>> >pretty
>> >> >> >quickly.
>> >> >> >> > > >>>
>> >> >> >> > > >>> And my laptop isn't the weakest one and has 16 gigs of
>> >> >> >memory, but
>> >> >> >> it
>> >> >> >> > > >>> cannot deal with it.
>> >> >> >> > > >>>
>> >> >> >> > > >>>
>> >> >> >> > > >>> On Wed, Aug 2, 2017 at 4:58 PM, Denis Magda
>> >> >> ><[hidden email]>
>> >> >> >> > wrote:
>> >> >> >> > > >>>
>> >> >> >> > > >>>>> As far as allocating 80% of available RAM - I was
>> >> >against
>> >> >> >this
>> >> >> >> even
>> >> >> >> > > for
>> >> >> >> > > >>>>> In-memory mode and still think that this is a wrong
>> >> >> >default.
>> >> >> >> > Looking
>> >> >> >> > > at
>> >> >> >> > > >>>>> free RAM is even worse because it gives you undefined
>> >> >> >behavior.
>> >> >> >> > > >>>>
>> >> >> >> > > >>>> Guys, I can not understand how this dynamic memory
>> >> >> >allocation's
>> >> >> >> > > >> high-level
>> >> >> >> > > >>>> behavior (with the persistence DISABLED) is different
>> >> >from
>> >> >> >the
>> >> >> >> > legacy
>> >> >> >> > > >>>> off-heap memory we had in 1.x. Both off-heap memories
>> >> >> >allocate the
>> >> >> >> > > >> space on
>> >> >> >> > > >>>> demand, the current just does this more aggressively
>> >> >> >requesting
>> >> >> >> big
>> >> >> >> > > >> chunks.
>> >> >> >> > > >>>>
>> >> >> >> > > >>>> Next, the legacy one was unlimited by default and the
>> >> >user
>> >> >> >can
>> >> >> >> start
>> >> >> >> > > as
>> >> >> >> > > >>>> many nodes as he wanted on a laptop and preload as
>> >much
>> >> >data
>> >> >> >as he
>> >> >> >> > > >> needed.
>> >> >> >> > > >>>> Sure he could bring down the laptop if too many
>> >entries
>> >> >were
>> >> >> >> > injected
>> >> >> >> > > >> into
>> >> >> >> > > >>>> the local cluster. But that’s about too massive
>> >> >preloading
>> >> >> >and not
>> >> >> >> > > >> caused
>> >> >> >> > > >>>> by the ability of the legacy off-heap memory to grow
>> >> >> >infinitely.
>> >> >> >> The
>> >> >> >> > > >> same
>> >> >> >> > > >>>> preloading would cause a hang if the Java heap memory
>> >> >mode
>> >> >> >is
>> >> >> >> used.
>> >> >> >> > > >>>>
>> >> >> >> > > >>>> The upshot is that the massive preloading of data on
>> >the
>> >> >> >local
>> >> >> >> > laptop
>> >> >> >> > > >>>> should not be fixed by repealing the dynamic memory
>> >> >> >allocation.
>> >> >> >> > > >>>> Is there any other reason why we have to use the
>> >static
>> >> >> >memory
>> >> >> >> > > >> allocation
>> >> >> >> > > >>>> for the case when the persistence is disabled? I think
>> >> >the
>> >> >> >case
>> >> >> >> with
>> >> >> >> > > the
>> >> >> >> > > >>>> persistence should be reviewed separately.
>> >> >> >> > > >>>>
>> >> >> >> > > >>>> —
>> >> >> >> > > >>>> Denis
>> >> >> >> > > >>>>
>> >> >> >> > > >>>>> On Aug 2, 2017, at 12:45 AM, Alexey Goncharuk <
>> >> >> >> > > >>>> [hidden email]> wrote:
>> >> >> >> > > >>>>>
>> >> >> >> > > >>>>> Dmitriy,
>> >> >> >> > > >>>>>
>> >> >> >> > > >>>>> The reason behind this is the need to be able to
>> >> >evict
>> >> >> >and
>> >> >> >> load
>> >> >> >> > > >> pages
>> >> >> >> > > >>>> to
>> >> >> >> > > >>>>> disk, thus we need to preserve a PageId->Pointer
>> >mapping
>> >> >in
>> >> >> >> memory.
>> >> >> >> > > In
>> >> >> >> > > >>>>> order to do this in the most efficient way, we need
>> >to
>> >> >know
>> >> >> >in
>> >> >> >> > > advance
>> >> >> >> > > >>>> all
>> >> >> >> > > >>>>> the address ranges we work with. We can add dynamic
>> >> >memory
>> >> >> >> > extension
>> >> >> >> > > >> for
>> >> >> >> > > >>>>> persistence-enabled config, but this will add yet
>> >> >another
>> >> >> >step of
>> >> >> >> > > >>>>> indirection when resolving every page address, which
>> >> >adds a
>> >> >> >> > > noticeable
>> >> >> >> > > >>>>> performance penalty.
>> >> >> >> > > >>>>>
>> >> >> >> > > >>>>>
>> >> >> >> > > >>>>>
>> >> >> >> > > >>>>> 2017-08-02 10:37 GMT+03:00 Dmitriy Setrakyan <
>> >> >> >> > [hidden email]
>> >> >> >> > > >:
>> >> >> >> > > >>>>>
>> >> >> >> > > >>>>>> On Wed, Aug 2, 2017 at 9:33 AM, Vladimir Ozerov <
>> >> >> >> > > [hidden email]
>> >> >> >> > > >>>
>> >> >> >> > > >>>>>> wrote:
>> >> >> >> > > >>>>>>
>> >> >> >> > > >>>>>>> Dima,
>> >> >> >> > > >>>>>>>
>> >> >> >> > > >>>>>>> Probably folks who worked closely with storage know
>> >> >why.
>> >> >> >> > > >>>>>>>
>> >> >> >> > > >>>>>>
>> >> >> >> > > >>>>>> Without knowing why, how can we make a decision?
>> >> >> >> > > >>>>>>
>> >> >> >> > > >>>>>> Alexey Goncharuk, was it you who made the decision
>> >> >about
>> >> >> >not
>> >> >> >> using
>> >> >> >> > > >>>>>> increments? Do you remember what the reason was?
>> >> >> >> > > >>>>>>
>> >> >> >> > > >>>>>>
>> >> >> >> > > >>>>>>>
>> >> >> >> > > >>>>>>> The very problem is that before being started once
>> >on
>> >> >> >> production
>> >> >> >> > > >>>>>>> environment, Ignite will typically be started
>> >hundred
>> >> >> >times on
>> >> >> >> > > >>>>>> developer's
>> >> >> >> > > >>>>>>> environment. I think that default should be ~10% of
>> >> >total
>> >> >> >RAM.
>> >> >> >> > > >>>>>>>
>> >> >> >> > > >>>>>>
>> >> >> >> > > >>>>>> Why not 80% of *free *RAM?
>> >> >> >> > > >>>>>>
>> >> >> >> > > >>>>>>
>> >> >> >> > > >>>>>>>
>> >> >> >> > > >>>>>>> On Wed, Aug 2, 2017 at 10:21 AM, Dmitriy Setrakyan
>> ><
>> >> >> >> > > >>>>>> [hidden email]>
>> >> >> >> > > >>>>>>> wrote:
>> >> >> >> > > >>>>>>>
>> >> >> >> > > >>>>>>>> On Wed, Aug 2, 2017 at 7:27 AM, Vladimir Ozerov <
>> >> >> >> > > >> [hidden email]
>> >> >> >> > > >>>>>
>> >> >> >> > > >>>>>>>> wrote:
>> >> >> >> > > >>>>>>>>
>> >> >> >> > > >>>>>>>>> Please see original Sergey's message - when
>> >> >persistence
>> >> >> >is
>> >> >> >> > > enabled,
>> >> >> >> > > >>>>>>>> memory
>> >> >> >> > > >>>>>>>>> is not allocated incrementally, maxSize is used.
>> >> >> >> > > >>>>>>>>>
>> >> >> >> > > >>>>>>>>
>> >> >> >> > > >>>>>>>> Why?
>> >> >> >> > > >>>>>>>>
>> >> >> >> > > >>>>>>>>
>> >> >> >> > > >>>>>>>>> Default settings must allow for normal work on
>> >> >> >developer's
>> >> >> >> > > >>>>>> environment.
>> >> >> >> > > >>>>>>>>>
>> >> >> >> > > >>>>>>>>
>> >> >> >> > > >>>>>>>> Agree, but why not in increments?
>> >> >> >> > > >>>>>>>>
>> >> >> >> > > >>>>>>>>
>> >> >> >> > > >>>>>>>>>
>> >> >> >> > > >>>>>>>>> Wed, Aug 2, 2017 at 1:10, Denis Magda
>> >> >> ><[hidden email]>:
>> >> >> >> > > >>>>>>>>>
>> >> >> >> > > >>>>>>>>>>> Why not allocate in increments automatically?
>> >> >> >> > > >>>>>>>>>>
>> >> >> >> > > >>>>>>>>>> This is exactly how the allocation works right
>> >now.
>> >> >> >The
>> >> >> >> memory
>> >> >> >> > > >> will
>> >> >> >> > > >>>>>>>> grow
>> >> >> >> > > >>>>>>>>>> incrementally until the max size is reached (80%
>> >of
>> >> >> >RAM by
>> >> >> >> > > >>>>>> default).
>> >> >> >> > > >>>>>>>>>>
>> >> >> >> > > >>>>>>>>>> —
>> >> >> >> > > >>>>>>>>>> Denis
>> >> >> >> > > >>>>>>>>>>
>> >> >> >> > > >>>>>>>>>>> On Aug 1, 2017, at 3:03 PM,
>> >[hidden email]
>> >> >> >wrote:
>> >> >> >> > > >>>>>>>>>>>
>> >> >> >> > > >>>>>>>>>>> Vova, 1GB seems a bit too small for me, and
>> >> >frankly i
>> >> >> >do
>> >> >> >> not
>> >> >> >> > > want
>> >> >> >> > > >>>>>>> to
>> >> >> >> > > >>>>>>>>>> guess. Why not allocate in increments
>> >> >automatically?
>> >> >> >> > > >>>>>>>>>>>
>> >> >> >> > > >>>>>>>>>>> D.
>> >> >> >> > > >>>>>>>>>>>
>> >> >> >> > > >>>>>>>>>>> On Aug 1, 2017, 11:03 PM, at 11:03 PM, Vladimir
>> >> >> >Ozerov <
>> >> >> >> > > >>>>>>>>>> [hidden email]> wrote:
>> >> >> >> > > >>>>>>>>>>>> Denis,
>> >> >> >> > > >>>>>>>>>>>> No doubts you haven't heard about it - AI 2.1
>> >> >with
>> >> >> >> > > persistence,
>> >> >> >> > > >>>>>>> when
>> >> >> >> > > >>>>>>>>>>>> 80% of
>> >> >> >> > > >>>>>>>>>>>> RAM is allocated right away, was released
>> >several
>> >> >> >days
>> >> >> >> ago.
>> >> >> >> > > How
>> >> >> >> > > >>>>>> do
>> >> >> >> > > >>>>>>>> you
>> >> >> >> > > >>>>>>>>>>>> think, how many users tried it already?
>> >> >> >> > > >>>>>>>>>>>>
>> >> >> >> > > >>>>>>>>>>>> Guys,
>> >> >> >> > > >>>>>>>>>>>> Do you really think allocating 80% of
>> >available
>> >> >RAM
>> >> >> >is a
>> >> >> >> > > normal
>> >> >> >> > > >>>>>>>> thing?
>> >> >> >> > > >>>>>>>>>>>> Take
>> >> >> >> > > >>>>>>>>>>>> your laptop and check how many available RAM
>> >you
>> >> >> >have
>> >> >> >> right
>> >> >> >> > > now.
>> >> >> >> > > >>>>>>> Do
>> >> >> >> > > >>>>>>>>> you
>> >> >> >> > > >>>>>>>>>>>> fit
>> >> >> >> > > >>>>>>>>>>>> to remaining 20%? If not, then running AI with
>> >> >> >persistence
>> >> >> >> > > with
>> >> >> >> > > >>>>>>> all
>> >> >> >> > > >>>>>>>>>>>> defaults will bring your machine down. This is
>> >> >> >insane. We
>> >> >> >> > > should
>> >> >> >> > > >>>>>>>>>>>> allocate no
>> >> >> >> > > >>>>>>>>>>>> more than 1Gb, so that user can play with it
>> >> >without
>> >> >> >any
>> >> >> >> > > >>>>>> problems.
>> >> >> >> > > >>>>>>>>>>>>
>> >> >> >> > > >>>>>>>>>>>> On Tue, Aug 1, 2017 at 10:26 PM, Denis Magda <
>> >> >> >> > > [hidden email]
>> >> >> >> > > >>>>>>>
>> >> >> >> > > >>>>>>>>> wrote:
>> >> >> >> > > >>>>>>>>>>>>
>> >> >> >> > > >>>>>>>>>>>>> My vote goes for option #1 too. I don’t think
>> >> >that
>> >> >> >80% is
>> >> >> >> > too
>> >> >> >> > > >>>>>>>>>>>> aggressive
>> >> >> >> > > >>>>>>>>>>>>> to bring it down.
>> >> >> >> > > >>>>>>>>>>>>>
>> >> >> >> > > >>>>>>>>>>>>> IGNITE-5717 was created to fix the issue of
>> >the
>> >> >80%
>> >> >> >RAM
>> >> >> >> > > >>>>>>> allocation
>> >> >> >> > > >>>>>>>> on
>> >> >> >> > > >>>>>>>>>>>> 64
>> >> >> >> > > >>>>>>>>>>>>> bit systems when Ignite works on top of 32
>> >bit
>> >> >JVM.
>> >> >> >I’ve
>> >> >> >> > not
>> >> >> >> > > >>>>>>> heard
>> >> >> >> > > >>>>>>>> of
>> >> >> >> > > >>>>>>>>>>>> any
>> >> >> >> > > >>>>>>>>>>>>> other complaints in regards the default
>> >> >allocation
>> >> >> >size.
>> >> >> >> > > >>>>>>>>>>>>>
>> >> >> >> > > >>>>>>>>>>>>> —
>> >> >> >> > > >>>>>>>>>>>>> Denis
>> >> >> >> > > >>>>>>>>>>>>>
>> >> >> >> > > >>>>>>>>>>>>>> On Aug 1, 2017, at 10:58 AM,
>> >> >[hidden email]
>> >> >> >> wrote:
>> >> >> >> > > >>>>>>>>>>>>>>
>> >> >> >> > > >>>>>>>>>>>>>> I prefer option #1.
>> >> >> >> > > >>>>>>>>>>>>>>
>> >> >> >> > > >>>>>>>>>>>>>> D.
>> >> >> >> > > >>>>>>>>>>>>>>
>> >> >> >> > > >>>>>>>>>>>>>> On Aug 1, 2017, 11:20 AM, at 11:20 AM,
>> >Sergey
>> >> >> >Chugunov <
>> >> >> >> > > >>>>>>>>>>>>> [hidden email]> wrote:
>> >> >> >> > > >>>>>>>>>>>>>>> Folks,
>> >> >> >> > > >>>>>>>>>>>>>>>
>> >> >> >> > > >>>>>>>>>>>>>>> I would like to get back to the question
>> >about
>> >> >> >> > MemoryPolicy
>> >> >> >> > > >>>>>>>>>>>> maxMemory
>> >> >> >> > > >>>>>>>>>>>>>>> defaults.
>> >> >> >> > > >>>>>>>>>>>>>>>
>> >> >> >> > > >>>>>>>>>>>>>>> Although MemoryPolicy may be configured
>> >with
>> >> >> >initial
>> >> >> >> and
>> >> >> >> > > >>>>>>>> maxMemory
>> >> >> >> > > >>>>>>>>>>>>>>> settings, when persistence is used
>> >> >MemoryPolicy
>> >> >> >always
>> >> >> >> > > >>>>>>> allocates
>> >> >> >> > > >>>>>>>>>>>>>>> maxMemory
>> >> >> >> > > >>>>>>>>>>>>>>> size for performance reasons.
>> >> >> >> > > >>>>>>>>>>>>>>>
>> >> >> >> > > >>>>>>>>>>>>>>> As default size of maxMemory is 80% of
>> >> >physical
>> >> >> >memory
>> >> >> >> it
>> >> >> >> > > >>>>>>> causes
>> >> >> >> > > >>>>>>>>>>>> OOME
>> >> >> >> > > >>>>>>>>>>>>>>> exceptions of 32 bit platforms (either on
>> >OS
>> >> >or
>> >> >> >JVM
>> >> >> >> > level)
>> >> >> >> > > >>>>>> and
>> >> >> >> > > >>>>>>>>>>>> hurts
>> >> >> >> > > >>>>>>>>>>>>>>> performance in setups when multiple Ignite
>> >> >nodes
>> >> >> >are
>> >> >> >> > > started
>> >> >> >> > > >>>>>> on
>> >> >> >> > > >>>>>>>>>>>> the
>> >> >> >> > > >>>>>>>>>>>>>>> same
>> >> >> >> > > >>>>>>>>>>>>>>> physical server.
>> >> >> >> > > >>>>>>>>>>>>>>>
>> >> >> >> > > >>>>>>>>>>>>>>> I suggest to rethink these defaults and
>> >switch
>> >> >to
>> >> >> >other
>> >> >> >> > > >>>>>>> options:
>> >> >> >> > > >>>>>>>>>>>>>>>
>> >> >> >> > > >>>>>>>>>>>>>>> - Check whether platform is 32 or 64 bits
>> >and
>> >> >> >adapt
>> >> >> >> > > defaults.
>> >> >> >> > > >>>>>>> In
>> >> >> >> > > >>>>>>>>>>>> this
>> >> >> >> > > >>>>>>>>>>>>>>> case we still need to address the issue
>> >with
>> >> >> >multiple
>> >> >> >> > nodes
>> >> >> >> > > >>>>>> on
>> >> >> >> > > >>>>>>>> one
>> >> >> >> > > >>>>>>>>>>>>>>> machine
>> >> >> >> > > >>>>>>>>>>>>>>> even on 64 bit systems.
>> >> >> >> > > >>>>>>>>>>>>>>>
>> >> >> >> > > >>>>>>>>>>>>>>> - Lower defaults for maxMemory and
>> >allocate,
>> >> >for
>> >> >> >> > instance,
>> >> >> >> > > >>>>>>>>>>>> max(0.3 *
>> >> >> >> > > >>>>>>>>>>>>>>> availableMemory, 1Gb).
>> >> >> >> > > >>>>>>>>>>>>>>> This option allows us to solve all issues
>> >with
>> >> >> >starting
>> >> >> >> > on
>> >> >> >> > > 32
>> >> >> >> > > >>>>>>> bit
>> >> >> >> > > >>>>>>>>>>>>>>> platforms and reduce instability with
>> >multiple
>> >> >> >nodes on
>> >> >> >> > the
>> >> >> >> > > >>>>>>> same
>> >> >> >> > > >>>>>>>>>>>>>>> machine.
>> >> >> >> > > >>>>>>>>>>>>>>>
>> >> >> >> > > >>>>>>>>>>>>>>>
>> >> >> >> > > >>>>>>>>>>>>>>> Thoughts and/or other options?
>> >> >> >> > > >>>>>>>>>>>>>>>
>> >> >> >> > > >>>>>>>>>>>>>>> Thanks,
>> >> >> >> > > >>>>>>>>>>>>>>> Sergey.
>> >> >> >> > > >>>>>>>>>>>>>
>> >> >> >> > > >>>>>>>>>>>>>
>> >> >> >> > > >>>>>>>>>>
>> >> >> >> > > >>>>>>>>>>
>> >> >> >> > > >>>>>>>>>
>> >> >> >> > > >>>>>>>>
>> >> >> >> > > >>>>>>>
>> >> >> >> > > >>>>>>
>> >> >> >> > > >>>>
>> >> >> >> > > >>>>
>> >> >> >> > > >>
>> >> >> >> > > >>
>> >> >> >> > >
>> >> >> >> > >
>> >> >> >> >
>> >> >> >>
>> >> >>
>> >>
>>
>
>

Re: [IGNITE-5717] improvements of MemoryPolicy default size

yzhdanov
Sergey, ticket looks good to me
(https://issues.apache.org/jira/browse/IGNITE-6003).

--Yakov

Re: [IGNITE-5717] improvements of MemoryPolicy default size

Dmitriy Pavlov
Hi Igniters,

Bumping this discussion: there is still something to do in the context of
32-bit VMs, see issue https://issues.apache.org/jira/browse/IGNITE-5618

Denis M. suggested a solution for this: let's make the default size
calculation more sophisticated. If Ignite is running in a 32-bit process and
20% of RAM is > 2 GB, then let's request only 1 GB or 1.5 GB.

Alex G., Sergey C., Igniters, could you please share your opinion here or
in JIRA?

Sincerely,
Dmitriy Pavlov

Wed, Aug 9, 2017 at 14:43, Yakov Zhdanov <[hidden email]>:

> Sergey, ticket looks good to me
> (https://issues.apache.org/jira/browse/IGNITE-6003).
>
> --Yakov
>

Re: [IGNITE-5717] improvements of MemoryPolicy default size

Alexey Goncharuk
I'm ok with using 1GB only on a 32-bit system. Note, though, that there is
no reliable way to detect this, so this will be a best-effort change.
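
A minimal sketch of the sizing rule discussed here, assuming the 20% default and the 1 GB cap mentioned in this thread. The class and method names are hypothetical and this is not Ignite's actual implementation. Detection reads the `sun.arch.data.model` system property, which is JVM-specific and may be absent, and falls back to a short list of `os.arch` values that I assume denote 32-bit architectures, so it is best-effort only.

```java
// Hypothetical sketch, not Ignite code.
class DefaultRegionSizeSketch {
    static final long GB = 1024L * 1024 * 1024;

    /** Best-effort guess whether the JVM is a 32-bit process. */
    static boolean looksLike32Bit(String dataModel, String osArch) {
        if (dataModel != null)
            return dataModel.trim().equals("32");

        // Fallback: a few well-known 32-bit arch names (assumed list);
        // anything else is treated as 64-bit.
        return osArch != null
            && (osArch.equals("x86") || osArch.equals("i386")
                || osArch.equals("i686") || osArch.equals("arm"));
    }

    /** 20% of physical RAM, capped to 1 GB when a 32-bit process would exceed 2 GB. */
    static long defaultMaxSize(long totalRam, boolean is32Bit) {
        long size = totalRam / 5;

        if (is32Bit && size > 2 * GB)
            return GB;

        return size;
    }

    public static void main(String[] args) {
        boolean is32Bit = looksLike32Bit(
            System.getProperty("sun.arch.data.model"),
            System.getProperty("os.arch"));

        // 16 GB is a sample value; a real caller would query the OS.
        System.out.println(defaultMaxSize(16 * GB, is32Bit));
    }
}
```

Passing the property values in as arguments keeps the detection heuristic separate from the sizing rule, so either part can be replaced if a more reliable check turns up.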

2017-10-31 14:07 GMT+03:00 Dmitry Pavlov <[hidden email]>:

> Hi Igniters,
>
> Bumping this discussion: there is still something to do in the context of
> 32-bit VMs, see issue https://issues.apache.org/jira/browse/IGNITE-5618
>
> Denis M. suggested a solution for this: let's make the default size
> calculation more sophisticated. If Ignite is running in a 32-bit process and
> 20% of RAM is > 2 GB, then let's request only 1 GB or 1.5 GB.
>
> Alex G., Sergey C., Igniters, could you please share your opinion here or
> in JIRA?
>
> Sincerely,
> Dmitriy Pavlov
>
> Wed, Aug 9, 2017 at 14:43, Yakov Zhdanov <[hidden email]>:
>
> > Sergey, ticket looks good to me
> > (https://issues.apache.org/jira/browse/IGNITE-6003).
> >
> > --Yakov
> >
>