Ignite has been deployed on Kubernetes with 3 replicas of the server pod. The pods were up and running fine for 9 days. We have created 180 inventory tables and 204 transactional tables. The data has been inserted with the PyIgnite client using the cache.put() method. This is a very slow operation because PyIgnite is very slow: each insert is committed one at a time, so it cannot do bulk-style inserts. PyIgnite was inserting into about 20 of the inventory tables simultaneously (20 different threads/processes).

The cluster was no longer stable after 9 days: one of the pods crashed and failed to recover. Below is the error log:

{"type":"log","host":"ignite-cluster-ignite-esoc-2","level":"ERROR","system":"ignite-service","time":"2019-08-16T17:13:34,769Z","logger":"GridCachePartitionExchangeManager","timezone":"UTC","log":"Failed to process custom exchange task: ClientCacheChangeDummyDiscoveryMessage [reqId=6b5f6c50-a8c9-4b04-a461-49bfd0112eb0, cachesToClose=null, startCaches=[BgwService]] java.lang.NullPointerException| at org.apache.ignite.internal.processors.cache.CacheAffinitySharedManager.processClientCachesChanges(CacheAffinitySharedManager.java:635)| at org.apache.ignite.internal.processors.cache.GridCacheProcessor.processCustomExchangeTask(GridCacheProcessor.java:391)| at org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$ExchangeWorker.processCustomTask(GridCachePartitionExchangeManager.java:2475)| at org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$ExchangeWorker.body0(GridCachePartitionExchangeManager.java:2620)| at org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager$ExchangeWorker.body(GridCachePartitionExchangeManager.java:2539)| at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:120)| at java.lang.Thread.run(Thread.java:748)"}
{"type":"log","host":"ignite-cluster-ignite-esoc-2","level":"WARN","system":"ignite-service","time":"2019-08-16T17:13:36,724Z","logger":"GridCacheDatabaseSharedManager","timezone":"UTC","log":"Ignite node stopped in the middle of checkpoint. Will restore memory state and finish checkpoint on node start."}

The error report file and ignite-config.xml have been attached for your info.

Heap memory and RAM configuration on each Ignite server container:

Heap Memory: 32GB
RAM: 64GB
Default memory region:
cpu: 4

Persistence volume
wal_storage_size: 10GB
persistence_storage_size: 10GB

Thanks
With Regards
Radha
Hello,
As I see, the community has stepped in and is ready to help with this problem in this discussion:
http://apache-ignite-users.70518.x6.nabble.com/One-of-Ignite-pod-keeps-crashing-and-not-joining-the-cluster-td29091.html#a29105

Please check out this response, which clarifies why the method you use for data loading is not optimal:
https://stackoverflow.com/questions/56778778/apache-ignite-inserts-extremely-slow/56795152#56795152

- Denis

On Tue, Aug 20, 2019 at 10:25 AM radha jai <[hidden email]> wrote:
> Ignite has been deployed on Kubernetes with 3 replicas of the server pod.
> The pods were up and running fine for 9 days. We have created 180
> inventory tables and 204 transactional tables. The data has been inserted
> with the PyIgnite client using the cache.put() method. This is a very slow
> operation because PyIgnite is very slow: each insert is committed one at a
> time, so it cannot do bulk-style inserts. PyIgnite was inserting into
> about 20 of the inventory tables simultaneously (20 different
> threads/processes).
>
> The cluster was no longer stable after 9 days: one of the pods crashed and
> failed to recover.
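To illustrate the point about per-entry cache.put() calls being slow, here is a minimal sketch of grouping writes into batches so that each network round trip carries many entries. It assumes PyIgnite's Cache.put_all(), which takes a dict of key/value pairs. The helper names (chunked, load_in_batches, load_with_pyignite), the batch size of 1000, and the host, port and cache name arguments are illustrative assumptions, not taken from the original post or from the linked answer, and this is not necessarily the approach that answer recommends.

```python
from itertools import islice
from typing import Any, Dict, Iterable, Iterator, Tuple


def chunked(pairs: Iterable[Tuple[Any, Any]], size: int) -> Iterator[Dict[Any, Any]]:
    """Yield dicts built from at most `size` consecutive (key, value) pairs."""
    if size <= 0:
        raise ValueError("size must be positive")
    it = iter(pairs)
    while True:
        batch = dict(islice(it, size))
        if not batch:
            return
        yield batch


def load_in_batches(cache: Any, pairs: Iterable[Tuple[Any, Any]],
                    batch_size: int = 1000) -> int:
    """Write pairs with one put_all() call per batch; return the number of calls.

    `cache` is any object exposing put_all(dict), such as a PyIgnite cache.
    """
    calls = 0
    for batch in chunked(pairs, batch_size):
        cache.put_all(batch)  # one request for the whole batch
        calls += 1
    return calls


def load_with_pyignite(host: str, port: int, cache_name: str,
                       pairs: Iterable[Tuple[Any, Any]],
                       batch_size: int = 1000) -> int:
    """Hypothetical wiring against a running cluster (needs pyignite installed)."""
    from pyignite import Client  # imported lazily so the helpers work without it

    client = Client()
    client.connect(host, port)
    try:
        cache = client.get_or_create_cache(cache_name)
        return load_in_batches(cache, pairs, batch_size)
    finally:
        client.close()
```

With 2500 rows and a batch size of 1000, load_in_batches issues 3 put_all() requests instead of 2500 individual put() calls. A suitable batch size depends on entry size and cluster memory, so it would need tuning against the actual data.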