Hello Igniters!
I have found some confusing behavior of the atomic partitioned cache with the `PRIMARY_SYNC` write synchronization mode.

The node with a primary partition sends a message to the remote nodes with backup partitions via `GridDhtAtomicAbstractUpdateFuture#sendDhtRequests`. If an error occurs during sending, it is effectively ignored, see [1]:

```
try {
    ....
    cctx.io().send(req.nodeId(), req, cctx.ioPolicy());
    ....
}
catch (ClusterTopologyCheckedException ignored) {
    ....
    registerResponse(req.nodeId());
}
catch (IgniteCheckedException ignored) {
    ....
    registerResponse(req.nodeId());
}
```

This behavior results in the primary partition and the backup partitions holding different values for a given key.

There is a reproducer [2].

Should we consider this behavior valid?

[1]. https://github.com/dgarus/ignite/blob/d473b507f04e2ec843c1da1066d8908e882396d7/modules/core/src/main/java/org/apache/ignite/internal/processors/cache/distributed/dht/atomic/GridDhtAtomicAbstractUpdateFuture.java#L473
[2]. https://github.com/apache/ignite/pull/4126/files#diff-5e5bfb73bd917d85f56a05552b1d014aR26
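To make the consequence concrete, here is a minimal, self-contained sketch in plain Java (not the actual Ignite classes; `Node`, `send`, `update`, and `linkUp` are hypothetical names) modeling why swallowing the send error leaves the primary and backup copies divergent:

```java
import java.util.HashMap;
import java.util.Map;

// Simplified model of the PRIMARY_SYNC flow: the primary applies the update,
// tries to forward it to the backup, and treats a send failure the way the
// code above does -- it just registers a response and moves on.
// All class and method names here are illustrative only.
public class SendErrorModel {
    static class Node {
        final Map<String, Integer> store = new HashMap<>();
    }

    static boolean linkUp = true; // simulates the network link to the backup

    static void send(Node backup, String key, int val) throws Exception {
        if (!linkUp)
            throw new Exception("connection failed"); // stands in for IgniteCheckedException

        backup.store.put(key, val);
    }

    /** Primary-side update: apply locally, then forward to the backup. */
    static void update(Node primary, Node backup, String key, int val) {
        primary.store.put(key, val);

        try {
            send(backup, key, val);
        }
        catch (Exception ignored) {
            // Current behavior: the error is swallowed and the update is
            // considered acknowledged (the registerResponse(...) branch).
        }
    }

    public static void main(String[] args) {
        Node primary = new Node(), backup = new Node();

        update(primary, backup, "k", 1); // link is up: both copies get 1

        linkUp = false;
        update(primary, backup, "k", 2); // send fails silently

        // Primary and backup now disagree for the same key.
        System.out.println(primary.store.get("k") + " vs " + backup.store.get("k"));
    }
}
```

The caller sees a successful update, yet a later read from the backup (e.g. after primary failover) would return the stale value.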
Denis,
Seems that you are right, it is a problem. I guess in this case the primary node should send a CachePartialUpdateException to the near node.

On Tue, Jun 5, 2018 at 6:13 PM, Denis Garus <[hidden email]> wrote:
> [...]
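A sketch of that suggestion, again in plain, self-contained Java rather than the real Ignite internals (the real fix would complete the DHT future with a CachePartialUpdateException delivered to the near node; `Sender`, `sendDhtRequests`, and `PartialUpdateException` below are illustrative stand-ins):

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical handling: instead of swallowing a failed backup send,
// record the failure and surface a partial-update error to the caller.
public class PartialUpdateSketch {
    static class PartialUpdateException extends Exception {
        final List<String> failedKeys;

        PartialUpdateException(List<String> failedKeys) {
            super("Failed to update keys on backups: " + failedKeys);
            this.failedKeys = failedKeys;
        }
    }

    interface Sender {
        void send(String nodeId, String key) throws Exception;
    }

    /** Forward an update to each backup; collect failures instead of ignoring them. */
    static void sendDhtRequests(Sender io, List<String> backups, String key)
        throws PartialUpdateException {
        List<String> failed = new ArrayList<>();

        for (String nodeId : backups) {
            try {
                io.send(nodeId, key);
            }
            catch (Exception e) {
                failed.add(key); // remember the failure instead of just registerResponse()
            }
        }

        if (!failed.isEmpty())
            throw new PartialUpdateException(failed); // propagate to the near node
    }

    public static void main(String[] args) {
        // A sender whose link to the second backup is broken.
        Sender flaky = (node, key) -> {
            if (node.equals("backup2"))
                throw new Exception("connection to " + node + " failed");
        };

        try {
            sendDhtRequests(flaky, List.of("backup1", "backup2"), "k");
        }
        catch (PartialUpdateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

The design point is that the caller learns which keys may be inconsistent and can retry or fail over, rather than observing a silently acknowledged update.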
Dmitry,
There are other cases that can result in an inconsistent state of an atomic cache with 2 or more backups.

1. For PRIMARY_SYNC. The primary sends requests to all backups and responds to the near node... and then one of the backup updates fails. Will the primary retry the update operation? I doubt it.

2. For all sync modes. The primary sends the request to the 1st backup and fails to send it to the 2nd backup... and then the near node suddenly dies. No one will retry, as the near node has gone.

On Tue, Jun 5, 2018 at 7:16 PM, Dmitriy Govorukhin <[hidden email]> wrote:
> [...]

--
Best regards,
Andrey V. Mashenkov
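Scenario 2 above is arguably worse than the original report, because the backups end up inconsistent with each other with no surviving party to reconcile them. A minimal plain-Java model (all names hypothetical, not Ignite code):

```java
import java.util.HashMap;
import java.util.Map;

// Model of scenario 2: the primary forwards the update to the 1st backup but
// fails on the 2nd, and the near node -- the only party that would retry --
// dies before observing the failure.
public class BackupDivergenceModel {
    public static void main(String[] args) {
        Map<String, Integer> backup1 = new HashMap<>();
        Map<String, Integer> backup2 = new HashMap<>();

        // Initial value replicated everywhere.
        backup1.put("k", 1);
        backup2.put("k", 1);

        // Primary forwards the new value: succeeds for backup1, fails for backup2.
        backup1.put("k", 2);
        boolean sendToBackup2Failed = true; // simulated IgniteCheckedException

        // Near node dies, so nobody retries: the two backups now disagree,
        // and a later primary failover may promote either of them.
        System.out.println(backup1.get("k") + " vs " + backup2.get("k")
            + (sendToBackup2Failed ? " (no retry pending)" : ""));
    }
}
```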
Denis, Alexey, please share your vision.
Sincerely,
Dmitriy Pavlov

On Tue, Jun 5, 2018 at 19:39, Andrey Mashenkov <[hidden email]> wrote:
> [...]