Igniters,

I'm working on client-side failover logic for the .NET Thin Client.
This will probably apply to the ODBC and JDBC thin clients as well in the future.

Currently all thin clients connect to a single specified Ignite node.
The idea is to have multiple known nodes (host:port pairs) and reconnect
to another node if the current one goes down.

Problems:
- The protocol is stateful: the server keeps track of query cursors for the session
- Many operations are not idempotent, so retry is not an option
- Async operations and multithreading are supported in the .NET thin client

So while we can detect a socket connection failure and reconnect to a different node,
all currently executing client operations and query cursors will still fail
with an exception.

I'm not sure how useful this behavior will be.
Any thoughts, ideas?

Thanks,
Pavel |
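To make the idempotency problem above concrete, here is a minimal sketch, written in Python for brevity rather than against the .NET thin client. `FlakyLink`, `call_with_retry`, and `make_counter` are hypothetical names invented for this illustration and are not part of any Ignite API. The sketch assumes the request reaches the server and only the reply is lost, which is the case where a blind retry does harm.

```python
class FlakyLink:
    """Hypothetical transport: delivers every request, but loses the first reply."""

    def __init__(self, replies_to_lose=1):
        self._replies_to_lose = replies_to_lose

    def call(self, operation):
        result = operation()  # the server applies the operation
        if self._replies_to_lose > 0:
            self._replies_to_lose -= 1
            # The client cannot tell whether the operation was applied.
            raise ConnectionError("connection lost before the reply arrived")
        return result


def call_with_retry(link, operation, attempts=2):
    """Blindly re-send the operation after a connection error."""
    last_error = None
    for _ in range(attempts):
        try:
            return link.call(operation)
        except ConnectionError as error:
            last_error = error
    raise last_error


def make_counter():
    """Server-side state with a non-idempotent increment operation."""
    state = {"hits": 0}

    def increment():
        state["hits"] += 1
        return state["hits"]

    return state, increment


state, increment = make_counter()
call_with_retry(FlakyLink(), increment)
# The caller asked for one increment, but the server applied two.
print(state["hits"])
```

An idempotent operation, such as putting a fixed value under a fixed key, would leave the same state after the retry, which is why only some operations could be retried transparently.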
Pavel,

I hope that at some point the Web Agent (the connector to Web Console) will be
refactored from REST to the thin client.

It would be nice if the thin client supported the following modes:
1) Specify several addresses in the thin client connection config. The thin client
will use ONLY these addresses (hardcoded list).
2) Same as #1, but in addition to the specified list of addresses, the thin client
collects a list of "connectable" nodes from the topology (extendable list).

What do you think?

On Wed, Jan 31, 2018 at 5:14 PM, Pavel Tupitsyn <[hidden email]> wrote:
> [...]

--
Alexey Kuznetsov |
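The two modes proposed above could be sketched as follows. This is an illustration in Python under stated assumptions, not an existing Ignite configuration API: the function name `candidate_addresses`, the `extendable` flag, and the idea that discovered addresses arrive as a plain list are all hypothetical.

```python
def candidate_addresses(configured, discovered=(), extendable=False):
    """Return the addresses a client may try, configured ones first.

    extendable=False: mode 1, only the configured (hardcoded) list is used.
    extendable=True:  mode 2, addresses discovered from the topology are
                      appended after the configured ones.
    Duplicates are dropped while the original order is preserved.
    """
    result = list(dict.fromkeys(configured))
    if extendable:
        for address in discovered:
            if address not in result:
                result.append(address)
    return result


configured = ["10.0.0.1:10800", "10.0.0.2:10800"]
discovered = ["10.0.0.2:10800", "10.0.0.3:10800"]

print(candidate_addresses(configured, discovered))
print(candidate_addresses(configured, discovered, extendable=True))
```

Keeping the configured addresses first means the user-supplied list stays the primary source even in the extendable mode.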
Alexey, retrieving addresses from the topology makes sense, but in this thread
I'm trying to understand whether any kind of built-in failover
makes sense at all at the Ignite API level.

I mean, at the business logic level failover certainly makes sense:
if the Web Agent fails to execute some operation, it can show an error,
automatically reconnect to another node, and continue working.

But at the Ignite API level it gets questionable. We can implement some
failover/reconnect logic, but users still have to handle failed operations
themselves.

Pavel

On Wed, Jan 31, 2018 at 2:08 PM, Alexey Kuznetsov <[hidden email]> wrote:
> [...] |
Well, I agree with Pavel here. To me it looks like this feature gives
little to users, as they need to write the same amount of code
as they would without it. It will also introduce
new issues with operations "hanging" while the thin client
tries and fails to reconnect to several nodes.

On the other hand, the second approach suggested by Alexey
makes more sense to me. In the general case, the user does not have
a list of all nodes in the cluster, so reconnection to the "next alive"
node could be useful in some cases.

Best Regards,
Igor

On Wed, Jan 31, 2018 at 2:15 PM, Pavel Tupitsyn <[hidden email]> wrote:
> [...] |
Pavel,

I completely agree with you. No need for "black magic"; just throw an appropriate
exception, for example, IgniteThinClientConnectionLostException.

But it would be very useful to "reconnect to the next alive node", so that not
every user app has to reinvent the wheel.

On Wed, Jan 31, 2018 at 7:21 PM, Igor Sapego <[hidden email]> wrote:
> [...]

--
Alexey Kuznetsov |
In reply to this post by Pavel Tupitsyn
Pavel,

I disagree. I think automatic reconnect is a very useful feature. For
example, all client-side operations can throw an exception anyway, so if you
throw an exception due to a client reconnect, it will not require any
additional exception-handling logic.

On the other hand, after a few failed operations in case of a reconnect,
the client will continue to operate normally. This will make our clients
resilient to failures and way more powerful.

I strongly vote to add this behavior.

D.

On Wed, Jan 31, 2018 at 3:15 AM, Pavel Tupitsyn <[hidden email]> wrote:
> [...] |
Ok, let's add simple reconnect logic and see what will come of it.

On Thu, Feb 1, 2018 at 12:49 AM, Dmitriy Setrakyan <[hidden email]> wrote:
> [...] |
On Thu, Feb 1, 2018 at 5:55 AM, Pavel Tupitsyn <[hidden email]> wrote:
> Ok, let's add simple reconnect logic and see what will come of it.

Just to clarify, a list of IP addresses the client should connect to needs to
be provided on startup. Once a connection is lost, the client needs to try to
connect to all other IPs in the list before failing. Agreed, this should be
fairly simple to implement. |
Dmitriy, yes, that's what I'm implementing as part of IGNITE-7282:
* List of hosts in config
* Pick a random index (basic load balancing), connect
* When the connection is lost, all current requests throw an exception
* The next request causes a reconnect attempt to the next host
* If all hosts fail, throw an exception

On Fri, Feb 2, 2018 at 12:44 AM, Dmitriy Setrakyan <[hidden email]> wrote:
> [...] |
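The steps listed above can be sketched roughly as follows. This is a minimal single-threaded illustration in Python, not the IGNITE-7282 implementation: `FailoverClient`, `ConnectionLostError`, the `connect` callable, and the connection's `send` method are all hypothetical names, and the async and multithreading concerns raised earlier in the thread are deliberately left out.

```python
import random


class ConnectionLostError(Exception):
    """Raised when the connection drops or no configured host is reachable."""


class FailoverClient:
    def __init__(self, hosts, connect, rng=random):
        if not hosts:
            raise ValueError("at least one host is required")
        self._hosts = list(hosts)
        # Hypothetical: connect(host) returns an object with send(payload)
        # and raises OSError when the host is unreachable.
        self._connect = connect
        # A random starting index gives basic load balancing across clients.
        self._index = rng.randrange(len(self._hosts))
        self._conn = None
        self._connect_any()

    def _connect_any(self):
        """Try every host once, starting at the current index."""
        for offset in range(len(self._hosts)):
            i = (self._index + offset) % len(self._hosts)
            try:
                self._conn = self._connect(self._hosts[i])
                self._index = i
                return
            except OSError:
                continue
        self._conn = None
        raise ConnectionLostError("all hosts failed")

    def request(self, payload):
        if self._conn is None:
            # A previous request lost the connection: move on to the next host.
            self._index = (self._index + 1) % len(self._hosts)
            self._connect_any()
        try:
            return self._conn.send(payload)
        except OSError as error:
            # The current request fails; the reconnect happens on the next one.
            self._conn = None
            raise ConnectionLostError(str(error)) from error
```

In this sketch the failing request is never retried, which matches the point made earlier in the thread that many operations are not idempotent. The caller sees one exception and the following request goes to another host.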
Pavel,

Does it make sense, in case of "connection lost", to ping the available addresses
in parallel?
For example, using a thread pool of 4 threads?
This may speed up detecting the next alive node under Windows if several
addresses become unavailable at once.

On Fri, Feb 2, 2018 at 2:09 PM, Pavel Tupitsyn <[hidden email]> wrote:
> [...]

--
Alexey Kuznetsov
GridGain Systems
www.gridgain.com |
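The parallel probing idea above might look roughly like this. It is a sketch in Python using the standard `concurrent.futures` module, with the pool size of 4 taken from the suggestion; `tcp_probe` and `first_alive` are hypothetical helper names, and a plain TCP connect is only an assumed stand-in for whatever "ping" would mean for a thin client.

```python
import socket
from concurrent.futures import ThreadPoolExecutor, as_completed


def tcp_probe(address, timeout=1.0):
    """Return True if a TCP connection to (host, port) can be opened."""
    host, port = address
    with socket.create_connection((host, port), timeout=timeout):
        return True


def first_alive(addresses, probe=tcp_probe, workers=4):
    """Probe addresses in parallel and return the first one that responds.

    "First" means first to complete, not first in list order.
    Returns None when every probe fails.
    """
    if not addresses:
        return None
    pool = ThreadPoolExecutor(max_workers=workers)
    try:
        futures = {pool.submit(probe, a): a for a in addresses}
        for future in as_completed(futures):
            try:
                if future.result():
                    return futures[future]
            except OSError:
                continue  # this address is down, wait for the others
        return None
    finally:
        # Do not block on probes that are still waiting for their timeout.
        pool.shutdown(wait=False)
```

With sequential probing the worst case is the sum of all connect timeouts. With this approach it is closer to the number of dead addresses divided by the pool size, times the timeout, at the cost of extra threads and less predictable host selection.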
Alexey, let's keep it simple for now.

On Fri, Feb 2, 2018 at 11:47 AM, Alexey Kuznetsov <[hidden email]> wrote:
> [...] |