In a partitioned cache (or set of partitioned caches), does a single node failure mean that all of the caches become unavailable?

I am seeing a situation where I cannot access any of the caches (using getOrCreateCache): all my code just "hangs".

The interesting thing is that visor can see all the caches and their contents. What is so special about visor?

I would appreciate it if someone would try to answer any of these (I can provide more info), as I am evaluating Ignite for our use in a data science/analytics setup :-)

Thanks!
Ognen
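
To make the question concrete, here is a minimal toy model of how partition copies might be spread over a five-node cluster and what a single node failure would take with it. Everything in it is an assumption for illustration: the node names, the round-robin placement, and the partition count of 1024 are hypothetical and are not Ignite's actual affinity function. The `backups` parameter is only meant to mirror the idea of the backups setting in a partitioned cache configuration. The sketch shows only which partitions would lose every copy; it says nothing about why getOrCreateCache hangs.

```python
# Toy model only: round-robin placement, NOT Ignite's real affinity function.
NODES = ["node-1", "node-2", "node-3", "node-4", "node-5"]  # hypothetical names


def owners(partition: int, nodes: list[str], backups: int) -> list[str]:
    """Return the primary node plus `backups` distinct backup nodes."""
    copies = min(backups + 1, len(nodes))
    return [nodes[(partition + i) % len(nodes)] for i in range(copies)]


def lost_partitions(nodes: list[str], backups: int, failed: list[str],
                    partitions: int = 1024) -> list[int]:
    """Return the partitions whose every copy lived on a failed node."""
    down = set(failed)
    return [p for p in range(partitions)
            if all(node in down for node in owners(p, nodes, backups))]


if __name__ == "__main__":
    for backups in (0, 1):
        lost = lost_partitions(NODES, backups=backups, failed=["node-3"])
        print(f"backups={backups}: {len(lost)} of 1024 partitions lost")
```

In this model, a cache with no backups loses the share of partitions owned by the failed node while the rest stay intact, and one backup is enough to survive a single failure. Whether the real cluster behaves this way depends on the actual cache configuration, which the thread does not show.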

Can you please file a ticket and share your sample application with us?

If that is not possible, then please attach verbose logs and thread dumps from all the nodes, taken after the issue is reproduced.

Thanks!
--Yakov

2015-05-12 15:30 GMT+03:00 Ognen Duzlevski <[hidden email]>:
> In a partitioned cache (or set of partitioned caches) - does a single node
> failure mean all of the cache(s) become unavailable?
>
> I am seeing a situation where I cannot access any of the caches (using
> getOrCreateCache) - all my code just "hangs".
>
> The interesting thing is that visor can see all the caches and their
> contents.
>
> What is so special about visor?
>
> I would appreciate if someone would try and answer any of these (I can
> provide more info). as I am evaluating ignite for our use in a data
> science/analytics setup :-)
>
> Thanks!
> Ognen

Ognen,

It sounds to me like this is the same issue you had recently with the cloud node crashing due to a hardware failure. If that is the case, then it sounds like a firewall issue to me. Are you sure there is no firewall set up between the nodes, and that they are all deployed in the same availability zone?

D.

On Tue, May 12, 2015 at 1:33 PM, Yakov Zhdanov <[hidden email]> wrote:
> Can you please file a ticket and share your sample application with us?
>
> If it is not possible, then attach verbose logs from all the nodes and
> threaddumps from all the nodes after issue gets reproduced.
>
> Thanks!
>
> --Yakov

Dmitriy,

It is not a firewall issue. However, the hardware crash probably has something to do with it.

In that direction: can one expect a crash of one node (out of 5) housing a few partitioned caches to affect the availability of all the caches? The strange thing is that visor was able to show them all, but acquiring them through a Scala app using getOrCreateCache() just hung. I ended up "rigging" visor with a capability to dump cache -scan results to a file. That let me salvage all my data, and then I restarted the cluster.

Certainly pretty clumsy ;)

Ognen

On Tue, May 12, 2015 at 1:28 PM, Dmitriy Setrakyan <[hidden email]> wrote:
> Ognen,
>
> It sounds to me like this is the same issue you had recently with the cloud
> node crashing due to hardware failure. If this is the case, then it sounds
> like a firewall issue for me. Are you sure there is no firewall setup
> between nodes and they are all deployed in the same availability zone?
>
> D.

Ognen, can you share your test via a Jira issue?

It would be very helpful if you could take logs and thread dumps from all the nodes in the topology and attach them all to one Jira issue.

Thanks!

--
Yakov Zhdanov, Director R&D
*GridGain Systems*
www.gridgain.com

2015-05-12 22:33 GMT+03:00 Ognen Duzlevski <[hidden email]>:
> Dmitriy,
>
> It is not a firewall issue. However, the hardware crash has something to do
> with it probably.
>
> In that direction - can one expect a crash of one node (out of 5) housing a
> few partitioned caches to affect the availability of all the caches? The
> strange thing is visor was able to show them all but acquiring them through
> a Scala app using getOrCreateCache() just hung. I ended up "rigging" visor
> with a capability to dump cache -scan results to a file - I was able to
> salvage all my data and then I restarted the cluster.
>
> Certainly pretty clumsy ;)
>
> Ognen

Yakov, yes, no problem. I will do that today.

On Thu, May 14, 2015 at 6:15 AM, Yakov Zhdanov <[hidden email]> wrote:
> Ognen, can you share your test via Jira issue?
>
> It would be very helpful if you could take logs and threaddumps from all
> the nodes in topology and put them all together to a Jira issue.
>
> Thanks!
>
> --
> Yakov Zhdanov, Director R&D
> *GridGain Systems*
> www.gridgain.com