Hi all.
We have 2 servers and a cache X. On both servers a method runs regularly and executes a ScanQuery on that cache. We get the partitions for that query via

    ignite.affinity(cacheName).primaryPartitions(ignite.cluster().localNode())

and run the query on each partition. When the cache has been destroyed by the master server, on the second server we get:

javax.cache.CacheException: class org.apache.ignite.IgniteCheckedException: null
        at org.apache.ignite.internal.processors.cache.IgniteCacheProxy.query(IgniteCacheProxy.java:740)
        at com.intellica.evam.engine.event.future.FutureEventWorker.processFutureEvents(FutureEventWorker.java:117)
        at com.intellica.evam.engine.event.future.FutureEventWorker.run(FutureEventWorker.java:66)
Caused by: class org.apache.ignite.IgniteCheckedException: null
        at org.apache.ignite.internal.processors.query.GridQueryProcessor.executeQuery(GridQueryProcessor.java:1693)
        at org.apache.ignite.internal.processors.cache.IgniteCacheProxy.query(IgniteCacheProxy.java:494)
        at org.apache.ignite.internal.processors.cache.IgniteCacheProxy.query(IgniteCacheProxy.java:732)
        ... 2 more
Caused by: java.lang.NullPointerException
        at org.apache.ignite.internal.processors.cache.query.GridCacheQueryAdapter$ScanQueryFallbackClosableIterator.init(GridCacheQueryAdapter.java:712)
        at org.apache.ignite.internal.processors.cache.query.GridCacheQueryAdapter$ScanQueryFallbackClosableIterator.<init>(GridCacheQueryAdapter.java:677)
        at org.apache.ignite.internal.processors.cache.query.GridCacheQueryAdapter$ScanQueryFallbackClosableIterator.<init>(GridCacheQueryAdapter.java:628)
        at org.apache.ignite.internal.processors.cache.query.GridCacheQueryAdapter.executeScanQuery(GridCacheQueryAdapter.java:548)
        at org.apache.ignite.internal.processors.cache.IgniteCacheProxy$2.applyx(IgniteCacheProxy.java:497)
        at org.apache.ignite.internal.processors.cache.IgniteCacheProxy$2.applyx(IgniteCacheProxy.java:495)
        at org.apache.ignite.internal.util.lang.IgniteOutClosureX.apply(IgniteOutClosureX.java:36)
        at org.apache.ignite.internal.processors.query.GridQueryProcessor.executeQuery(GridQueryProcessor.java:1670)
        ... 4 more

for a while, until the cache is closed on that server too.

The corresponding lines are:

    710:        final ClusterNode node = nodes.poll();
    711:
    712:        if (*node*.isLocal()) {

Obviously node is null. nodes is a deque filled by the following method:

    private Queue<ClusterNode> fallbacks(AffinityTopologyVersion topVer) {
        Deque<ClusterNode> fallbacks = new LinkedList<>();
        Collection<ClusterNode> owners = new HashSet<>();

        for (ClusterNode node : cctx.topology().owners(part, topVer)) {
            if (node.isLocal())
                fallbacks.addFirst(node);
            else
                fallbacks.add(node);

            owners.add(node);
        }

        for (ClusterNode node : cctx.topology().moving(part)) {
            if (!owners.contains(node))
                fallbacks.add(node);
        }

        return fallbacks;
    }

These errors occur before the cache is closed on the second server, so checking whether the cache is closed is not enough.

Why, when we take the partitions for the local node, do we get some partitions, yet Ignite can't find any owner for those partitions?
Is our method for getting partitions wrong?
Is there any way to avoid that?

Best regards.

--
Alper Tekinalp

Software Developer
Evam Streaming Analytics

Atatürk Mah. Turgut Özal Bulv.
Gardenya 5 Plaza K:6 Ataşehir
34758 İSTANBUL

Tel: +90 216 455 01 53  Fax: +90 216 455 01 54
www.evam.com.tr
|
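[Editorial note: the null described above can be reproduced with plain collections. java.util.Deque.poll() returns null instead of throwing when the deque is empty, so if owners(part, topVer) and moving(part) both contribute nothing for the partition, the later node.isLocal() dereference fails exactly as in the trace. A standalone sketch — the ClusterNode interface here is a hypothetical stand-in for illustration, not Ignite's class:]

```java
import java.util.Deque;
import java.util.LinkedList;

public class PollNullDemo {
    // Hypothetical stand-in for Ignite's ClusterNode, for illustration only.
    interface ClusterNode {
        boolean isLocal();
    }

    public static void main(String[] args) {
        // Mirrors fallbacks(): if no owners and no moving nodes are reported
        // for the partition (e.g. right after the cache is destroyed),
        // the deque stays empty.
        Deque<ClusterNode> fallbacks = new LinkedList<>();

        // poll() on an empty deque returns null rather than throwing.
        ClusterNode node = fallbacks.poll();
        System.out.println("polled node: " + node);

        try {
            // The dereference at GridCacheQueryAdapter line 712.
            node.isLocal();
        }
        catch (NullPointerException e) {
            System.out.println("NPE, as in the stack trace");
        }
    }
}
```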
Hi all.
It seems one method uses affinity:

    cctx.affinity().primaryPartitions(n.id(), topologyVersion());

while the other uses the topology API:

    cctx.topology().owners(part, topVer)

Are these two fully consistent?

On Tue, Dec 6, 2016 at 4:58 PM, Alper Tekinalp <[hidden email]> wrote:
> [...]

--
Alper Tekinalp
|
Hi Andrey.
Were you able to look at the code?

Regards.

On Thu, Dec 8, 2016 at 10:05 AM, Alper Tekinalp <[hidden email]> wrote:
> Hi.
>
>> Could you please share your reproducer example?
>
> I added classes to reproduce the error. It also throws "cache closed"
> errors, which I am OK with. But not the others.

--
Alper Tekinalp
|
Hi,
I've looked at your code. First of all, you have races in your code. For example, you start two threads and destroy the caches before the threads have finished, which leads to the "cache closed" error. Moreover, you stop the application before any thread has finished, which changes the topology and leads to the NPE.

Second, I don't understand why you use threads at all. Usually you should start an Ignite cluster, connect to it using a client node, and run the scan query. Starting several instances of Ignite in one JVM makes sense only for test purposes. In any case, you should start the Ignite instances, create the caches, and only after that run the threads with your tasks.

On Mon, Dec 12, 2016 at 11:23 AM, Alper Tekinalp <[hidden email]> wrote:
> [...]
|
Hi Andrey. First of all, thanks for your responses.

> First of all, you have races in your code. For example, you start two
> threads and destroy the caches before the threads have finished, which
> leads to the "cache closed" error.

It is of course reasonable to get "cache closed" errors in that case, and I am OK with that.

> Moreover, you stop the application before any thread has finished,
> which changes the topology and leads to the NPE.

Correct me if I am wrong, but I don't think that is true. The NPE occurs before the application is stopped, hence before any topology change. I can understand this error occurring on a topology change, but that is not the case here.

> Second, I don't understand why you use threads at all. Usually you
> should start an Ignite cluster, connect to it using a client node, and
> run the scan query. Starting several instances of Ignite in one JVM
> makes sense only for test purposes.

Actually this is test code with which I try to simulate our real use case. In our application we have two server nodes running internally, and we run the scan query on those server nodes. In some cases both servers destroy the cache, and I expect to get the "cache closed" errors — but not NPEs, because there is no change in topology.

Again, thanks for your responses.

Regards.

--
Alper Tekinalp
|
Hi all.
Is there any comment on this?

--
Alper Tekinalp
|
Hi,
As I already wrote, I don't see any race for a stable topology; problems are possible on an unstable topology. You can try waiting until the topology version is the same on all nodes in the cluster, and thereby avoid this race.

Unfortunately I can't see your code, except for the example, which has the drawbacks mentioned in my previous reply.

On Tue, Dec 20, 2016 at 11:37 AM, Alper Tekinalp <[hidden email]> wrote:
> [...]
|
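[Editorial note: the "wait until the topology version is the same on all nodes" suggestion can be sketched generically as a bounded polling loop. The LongSupplier sources are an assumption — in a real setup each could be backed by a node's ignite.cluster().topologyVersion(); this helper itself is not an Ignite API:]

```java
import java.util.concurrent.TimeUnit;
import java.util.function.LongSupplier;

public class TopologyWait {
    /**
     * Polls the given version suppliers (e.g. one per node) until they all
     * report the same value, or the timeout elapses. Returns true on success.
     */
    static boolean awaitSameVersion(long timeoutMs, LongSupplier... versions)
        throws InterruptedException {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMs);

        while (System.nanoTime() < deadline) {
            long first = versions[0].getAsLong();

            boolean same = true;
            for (LongSupplier v : versions)
                same &= v.getAsLong() == first;

            if (same)
                return true;

            Thread.sleep(50); // back off before re-checking
        }

        return false; // nodes never converged within the timeout
    }

    public static void main(String[] args) throws InterruptedException {
        // Two simulated nodes that already agree on version 5: succeeds.
        System.out.println(awaitSameVersion(1000, () -> 5L, () -> 5L));
        // Nodes that never agree: times out.
        System.out.println(awaitSameVersion(200, () -> 5L, () -> 6L));
    }
}
```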
Hi Andrey,
Do you have an idea what is causing the NPE in the first place? What is null? I didn't quite get it from the thread.

-Val

On Tue, Dec 20, 2016 at 2:14 AM, Andrey Gura <[hidden email]> wrote:
> [...]
|
Hi Val.
> Do you have an idea what is causing the NPE in the first place?

As Andrey says, the exception is caused by different topology versions on the cluster nodes.

> What is null?

node is null in the following code in org.apache.ignite.internal.processors.cache.query.GridCacheQueryAdapter$ScanQueryFallbackClosableIterator.init():

    710:        final ClusterNode node = nodes.poll();
    711:
    712:        if (*node*.isLocal()) {

and nodes is computed as:

    private Queue<ClusterNode> fallbacks(AffinityTopologyVersion topVer) {
        Deque<ClusterNode> fallbacks = new LinkedList<>();
        Collection<ClusterNode> owners = new HashSet<>();

        for (ClusterNode node : cctx.topology().owners(part, topVer)) {
            if (node.isLocal())
                fallbacks.addFirst(node);
            else
                fallbacks.add(node);

            owners.add(node);
        }

        for (ClusterNode node : cctx.topology().moving(part)) {
            if (!owners.contains(node))
                fallbacks.add(node);
        }

        return fallbacks;
    }

It is clear that if the topology versions are different, an NPE is to be expected. But my claim is that the NPE can occur without any change in topology versions: in the example/dummy code, the NPE happens before any change in topology versions on the cluster. But I could not convince him of that :) Or I am missing something really badly.

Thanks to all of you.

Regards.

--
Alper Tekinalp
|
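[Editorial note: since the NPE surfaces on the caller as a javax.cache.CacheException thrown from cache.query(...), one pragmatic mitigation while the race window exists is to retry the query a bounded number of times. A generic sketch — the simulated failure below stands in for the wrapped exception, and all names are illustrative, not from the thread:]

```java
import java.util.concurrent.Callable;

public class QueryRetry {
    /**
     * Runs the action, retrying up to maxAttempts times on any exception.
     * Rethrows the last failure if no attempt succeeds.
     */
    static <T> T withRetry(int maxAttempts, Callable<T> action) throws Exception {
        Exception last = null;

        for (int i = 0; i < maxAttempts; i++) {
            try {
                return action.call();
            }
            catch (Exception e) {
                last = e; // e.g. CacheException wrapping the NPE; try again
            }
        }

        throw last;
    }

    public static void main(String[] args) throws Exception {
        int[] calls = {0};

        // Fails twice (simulating the transient CacheException), then succeeds.
        String res = withRetry(5, () -> {
            if (calls[0]++ < 2)
                throw new RuntimeException("simulated CacheException");
            return "ok";
        });

        System.out.println(res + " after " + calls[0] + " calls");
    }
}
```

This only papers over the window between cache destroy and topology stabilization; it does not remove the underlying race.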