Out of memory with eviction failure on persisted cache

Raymond Wilson
I've been having a sporadic issue with the Ignite 2.7.5 JVM halting due to
an out-of-memory error related to a cache with persistence enabled.

I just upgraded to the C#/.NET Ignite 2.7.6 client to pick up support for
C# affinity functions, and the issue now appears regularly while adding
around 400 MB of data into the cache, which is configured with 128 MB of
memory (this was 64 MB, but I increased it to see if the failure would
resolve).

The error I get is:

2020-03-05 11:58:57,568 [542] ERR [MutableCacheComputeServer] JVM will be
halted immediately due to the failure: [failureCtx=FailureContext
[type=CRITICAL_ERROR, err=class o.a.i.i.mem.IgniteOutOfMemoryException:
Failed to find a page for eviction [segmentCapacity=1700, loaded=676,
maxDirtyPages=507, dirtyPages=675, cpPages=0, pinnedInSegment=2,
failedToPrepare=675]
Out of memory in data region [name=TAGFileBufferQueue, initSize=128.0 MiB,
maxSize=128.0 MiB, persistenceEnabled=true] Try the following:
  ^-- Increase maximum off-heap memory size
(DataRegionConfiguration.maxSize)
  ^-- Enable Ignite persistence (DataRegionConfiguration.persistenceEnabled)
  ^-- Enable eviction or expiration policies]]

I'm not running an eviction policy, as I thought one was not required for
caches with persistence enabled.

I'm surprised by this behaviour, as I expected the persistence mechanism to
handle it. The error about failing to find a page for eviction suggests the
persistence mechanism has fallen behind. If that is the case, this seems
like an unfriendly failure mode.

Thanks,
Raymond.
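
A data region along these lines is configured through DataStorageConfiguration. A minimal sketch in C# (region name and sizes taken from the error above; everything else is illustrative):

    using Apache.Ignite.Core;
    using Apache.Ignite.Core.Configuration;

    var cfg = new IgniteConfiguration
    {
        DataStorageConfiguration = new DataStorageConfiguration
        {
            DataRegionConfigurations = new[]
            {
                new DataRegionConfiguration
                {
                    Name = "TAGFileBufferQueue",
                    InitialSize = 128L * 1024 * 1024,  // 128 MB
                    MaxSize = 128L * 1024 * 1024,      // 128 MB
                    PersistenceEnabled = true
                }
            }
        }
    };

    using (var ignite = Ignition.Start(cfg))
    {
        // A cluster with persistence enabled starts inactive and
        // must be activated before caches can be used.
        ignite.GetCluster().SetActive(true);
    }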

Re: Out of memory with eviction failure on persisted cache

Raymond Wilson
To add some further detail:

There are two processes interacting with the cache. One process writes
data into the cache, while the second extracts data from the cache using a
continuous query. The reader process is the one throwing the exception.

Increasing the cache size further to 256 MB resolves the problem for this
data set; however, we have data sets more than 100 times this size which we
will be processing.

Thanks,
Raymond.
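
The reader side of such a setup, in outline, in C# (a minimal sketch; the listener body and key/value types are illustrative, not the actual code):

    using System.Collections.Generic;
    using Apache.Ignite.Core.Cache.Event;
    using Apache.Ignite.Core.Cache.Query.Continuous;

    // Invoked for every entry created or updated in the cache.
    class QueueListener : ICacheEntryEventListener<string, byte[]>
    {
        public void OnEvent(IEnumerable<ICacheEntryEvent<string, byte[]>> events)
        {
            foreach (var evt in events)
            {
                // Process evt.Value, then remove the entry from the queue.
            }
        }
    }

    var cache = ignite.GetOrCreateCache<string, byte[]>("TAGFileBufferQueue");

    // The handle must stay alive for as long as events should be delivered.
    using (cache.QueryContinuous(new ContinuousQuery<string, byte[]>(new QueueListener())))
    {
        // ... run until shutdown
    }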



Re: Out of memory with eviction failure on persisted cache

ezhuravl
Hi,

How are you loading the data? Do you use putAll or DataStreamer?

Evgenii
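
For comparison, streamer-based loading looks roughly like this in C# (a sketch; the cache name is reused from the error above, and the item shape is illustrative):

    using Apache.Ignite.Core.Datastream;

    using (var streamer = ignite.GetDataStreamer<string, byte[]>("TAGFileBufferQueue"))
    {
        // Existing keys are skipped, mirroring put-if-absent semantics.
        streamer.AllowOverwrite = false;

        foreach (var item in items)          // "items" is illustrative
            streamer.AddData(item.Key, item.Payload);

        streamer.Flush();  // also happens implicitly on Dispose
    }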


Re: Out of memory with eviction failure on persisted cache

Raymond Wilson
Hi Evgenii,

I am putting the elements individually using PutIfAbsent(). Each element
ranges from 2 KB to 35 KB in size.

Actually, the process that writes the data does not write it directly to
the cache; it uses a compute function to send the payload to the process
that is doing the reading. The compute function applies validation logic
and uses PutIfAbsent() to write the data into the cache.

Sorry for the confusion.

Raymond.
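
In outline, that write path might look like this in C# (a sketch; the validation step and types are placeholders, not the actual code):

    using System;
    using Apache.Ignite.Core;
    using Apache.Ignite.Core.Compute;
    using Apache.Ignite.Core.Resource;

    [Serializable]
    class SubmitPayload : IComputeFunc<bool>
    {
        // Injected by Ignite on the executing node; not serialized.
        [InstanceResource] [NonSerialized] private IIgnite _ignite;

        private readonly string _key;
        private readonly byte[] _payload;

        public SubmitPayload(string key, byte[] payload)
        {
            _key = key;
            _payload = payload;
        }

        public bool Invoke()
        {
            // Validation logic would run here before the write.
            var cache = _ignite.GetCache<string, byte[]>("TAGFileBufferQueue");
            return cache.PutIfAbsent(_key, _payload);
        }
    }

    // Writer side: execute the function on the grid rather than
    // putting directly into the cache.
    bool added = ignite.GetCompute().Call(new SubmitPayload(key, payload));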



Re: Out of memory with eviction failure on persisted cache

ezhuravl
Hi Raymond,

I tried to reproduce it, but without success. Can you share the reproducer?

Also, have you tried loading much more data with a 256 MB data region? I
think it should work without issues.

Thanks,
Evgenii


Re: Out of memory with eviction failure on persisted cache

Raymond Wilson
The reproducer is my development system, which is hard to share.

I have increased the size of the buffer to 256 MB, and it copes with the
example data load, though I have not tried larger data sets.

From an analytical perspective, is this an error that can be expected to
occur when using a cache with a persistent data region defined?

I'll see if I can make a small reproducer.


Re: Out of memory with eviction failure on persisted cache

Raymond Wilson
Evgenii,

I have created a reproducer that triggers the error with the buffer size set to 64 MB. The Program.cs and .csproj files, and the log for the run that triggered the error, are attached.

Thanks,
Raymond.

Attachments: Program.cs (8K), OutOfMemoryReproducer.csproj (380 bytes), ReproducerLog.txt (420K)
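
In outline, a reproducer of this shape starts a node with the small persistent region and writes variable-sized payloads in a loop; an illustrative C# sketch (not the attached Program.cs itself):

    using System;
    using Apache.Ignite.Core.Cache.Configuration;

    var cache = ignite.GetOrCreateCache<int, byte[]>(
        new CacheConfiguration("TAGFileBufferQueue")
        {
            DataRegionName = "TAGFileBufferQueue"  // the 64 MB persistent region
        });

    var rnd = new Random();

    // Keep writing payloads in the reported size range until the
    // region is saturated and the failure triggers.
    for (var i = 0; ; i++)
    {
        var payload = new byte[rnd.Next(2 * 1024, 35 * 1024)];  // 2 KB to 35 KB
        cache.PutIfAbsent(i, payload);
    }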

Re: Out of memory with eviction failure on persisted cache

Raymond Wilson
Evgenii,

Have you had a chance to look into the reproducer?

Thanks,
Raymond.


Re: Out of memory with eviction failure on persisted cache

ezhuravl
Raymond,

I've seen this behaviour before; it occurs during massive data loading into
a cluster with a small data region. It's not reproducible with normally
sized data regions, which I think is why this issue has not been fixed yet.

Best Regards,
Evgenii
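
The settings usually pointed at when checkpointing falls behind are write throttling and the checkpoint page buffer; a sketch of the relevant C# configuration (values illustrative, and not confirmed in this thread to avoid this particular failure):

    using System;
    using Apache.Ignite.Core.Configuration;

    var storage = new DataStorageConfiguration
    {
        // Slow writers down instead of failing when checkpointing lags.
        WriteThrottlingEnabled = true,

        CheckpointFrequency = TimeSpan.FromSeconds(30),

        DataRegionConfigurations = new[]
        {
            new DataRegionConfiguration
            {
                Name = "TAGFileBufferQueue",
                MaxSize = 256L * 1024 * 1024,
                PersistenceEnabled = true,

                // Room for dirty pages while a checkpoint is in progress.
                CheckpointPageBufferSize = 128L * 1024 * 1024
            }
        }
    };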
