Re: Issue with replicated cache


Re: Issue with replicated cache

dmagda
Let me loop in the Ignite dev list, since I haven't heard about such an
issue before. Personally, I don't see any misconfiguration in your Ignite config.

-
Denis


On Thu, Dec 26, 2019 at 10:17 AM Prasad Bhalerao <
[hidden email]> wrote:

> I used the cache.remove(key) method to delete the entry from the cache.
>
> Basically, I was not getting consistent results on subsequent API calls
> with the same input.
>
> So I used the GridGain Console to query the cache, executing the SQL on a
> single node at a time. While doing this I found the data only on node n1,
> but the same entry was not present on nodes n2, n3 and n4.
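>
> For reference, the same per-node check can be scripted instead of going
> through the console: broadcast a closure that runs a node-local SQL query on
> every server node. This is only a rough sketch; the "id" column and the key
> value 12345 are placeholders for whatever NetworkData actually exposes to SQL,
> and "ignite" is the already started Ignite instance.
>
>   ignite.compute(ignite.cluster().forServers()).broadcast(() -> {
>       Ignite local = Ignition.localIgnite();
>       IgniteCache<DefaultDataAffinityKey, NetworkData> c =
>           local.cache(CacheName.NETWORK_CACHE.name());
>       // setLocal(true) restricts the query to this node's own data, which is
>       // roughly what querying a single node from the console amounts to.
>       SqlFieldsQuery qry = new SqlFieldsQuery(
>           "select _key from NetworkData where id = ?").setArgs(12345L);
>       qry.setLocal(true);
>       // Each node prints its own local row count to its log/stdout.
>       System.out.println(local.cluster().localNode().id() + " -> "
>           + c.query(qry).getAll().size() + " row(s)");
>   });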
>
> Thanks,
> Prasad
>
>
>
>
> On Thu 26 Dec, 2019, 11:09 PM Denis Magda <[hidden email]> wrote:
>
>> Hello Prasad,
>>
>> What APIs did you use to remove the entry from the cache and what method
>> did you use to confirm that the entry still exists on some of the nodes?
>>
>> -
>> Denis
>>
>>
>> On Thu, Dec 26, 2019 at 8:54 AM Prasad Bhalerao <
>> [hidden email]> wrote:
>>
>>> Hi,
>>>
>>> I am using Ignite version 2.6.0 and the timeout settings are as follows.
>>>
>>> IgniteConfiguration cfg = new IgniteConfiguration();
>>> cfg.setFailureDetectionTimeout(120000);       // 2 minutes
>>> cfg.setNetworkTimeout(10000);                 // 10 seconds
>>> cfg.setClientFailureDetectionTimeout(120000); // 2 minutes
>>>
>>> I have 4 server nodes (n1, n2, n3, n4) and 6 client nodes. I am using a
>>> replicated cache, and the cache configuration is shown below.
>>> As you can see, write-through is false, read-through is true, and the write
>>> synchronization mode is FULL_SYNC.
>>>
>>> I ran into an issue: a network entry was removed from the network cache, but
>>> somehow it was removed from only 3 of the server nodes (n2, n3, n4). I was
>>> able to see the entry on node n1 consistently for a day after it was removed.
>>> So I checked the logs for any errors/warnings, but I could not find any.
>>> I did not see any segmentation issue in the logs; the cluster looked to be in
>>> a healthy state.
>>> When I checked the cache after 2 days, I could not find that entry. The cache
>>> was in the state it was supposed to be in. The servers were not stopped and
>>> restarted during this whole time.
>>>
>>> Somehow I am not able to reproduce this issue in the dev environment.
>>>
>>> Is there any way to investigate/debug this issue? Can someone please advise?
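>>>
>>> For example, would a broadcast that calls localPeek on every server node be a
>>> reasonable way to check a specific key? A rough sketch (the helper name is
>>> made up):
>>>
>>> static Collection<String> checkKeyOnAllNodes(Ignite ignite, DefaultDataAffinityKey key) {
>>>   return ignite.compute(ignite.cluster().forServers()).broadcast(() -> {
>>>     IgniteCache<DefaultDataAffinityKey, NetworkData> c =
>>>         Ignition.localIgnite().cache(CacheName.NETWORK_CACHE.name());
>>>     // localPeek never triggers read-through, so it reports only what is
>>>     // physically stored on this node (primary or backup copy), not what the
>>>     // cache store would load.
>>>     NetworkData val = c.localPeek(key, CachePeekMode.PRIMARY, CachePeekMode.BACKUP);
>>>     return Ignition.localIgnite().cluster().localNode().id()
>>>         + " -> " + (val == null ? "missing" : "present");
>>>   });
>>> }
>>>
>>> The cache configuration I mentioned is: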
>>>
>>> private CacheConfiguration networkCacheCfg() {
>>>   CacheConfiguration networkCacheCfg = new CacheConfiguration<>(CacheName.NETWORK_CACHE.name());
>>>   networkCacheCfg.setAtomicityMode(CacheAtomicityMode.ATOMIC);
>>>   networkCacheCfg.setWriteThrough(false);
>>>   networkCacheCfg.setReadThrough(true);
>>>   networkCacheCfg.setRebalanceMode(CacheRebalanceMode.ASYNC);
>>>   networkCacheCfg.setWriteSynchronizationMode(CacheWriteSynchronizationMode.FULL_SYNC);
>>>   networkCacheCfg.setBackups(this.backupCount);
>>>   networkCacheCfg.setCacheMode(CacheMode.REPLICATED);
>>>   Factory<NetworkDataCacheLoader> storeFactory = FactoryBuilder.factoryOf(NetworkDataCacheLoader.class);
>>>   networkCacheCfg.setCacheStoreFactory(storeFactory);
>>>   networkCacheCfg.setIndexedTypes(DefaultDataAffinityKey.class, NetworkData.class);
>>>   networkCacheCfg.setSqlIndexMaxInlineSize(65);
>>>   RendezvousAffinityFunction affinityFunction = new RendezvousAffinityFunction();
>>>   affinityFunction.setExcludeNeighbors(true);
>>>   networkCacheCfg.setAffinity(affinityFunction);
>>>   networkCacheCfg.setStatisticsEnabled(true);
>>>
>>>   return networkCacheCfg;
>>> }
>>>
>>>
>>>
>>> Thanks,
>>> Prasad
>>>
>>>

Re: Issue with replicated cache

ezhuravl
Hi Prasad,

Can you please share the logs from all nodes so I can check what was happening
with the cluster before the incident? It would be great to have the logs from
the time the nodes started.

Thanks,
Evgenii
