Apache Ignite Developers - Legacy Mail Archive

Read load balancing, read-through, ttl and optimistic serializable transactions

Alexey Goncharuk

Igniters,

I have recently discovered [1] that Ignite can end up in a state where an
optimistic serializable transaction can never be successfully committed
from a backup node [2].

In short, the root cause of this issue is that there are configurations
that allow a key to be stored on primary and backup nodes with different
versions. This is a fundamental design choice that was made a while ago;
however, I am not sure it is the right way to go. When primary and
backup versions differ and read load balancing is enabled, the version read
from the backup will never match the primary version, and the optimistic
serializable transaction will always fail.
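
To make the failure mode concrete, here is a minimal sketch in plain Java.
It is a toy model under stated assumptions, not Ignite code: the class
VersionMismatchSketch, its Entry type, and the version numbers 5 and 7 are
all hypothetical. It assumes only what is described above, namely that the
commit validates the version seen at read time against the primary copy.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model (not Ignite code): one key whose primary and backup copies carry different versions. */
class VersionMismatchSketch {
    /** A value together with the version assigned when it was written or loaded. */
    static final class Entry {
        final String val;
        final long ver;

        Entry(String val, long ver) {
            this.val = val;
            this.ver = ver;
        }
    }

    final Map<String, Entry> primary = new HashMap<>();
    final Map<String, Entry> backup = new HashMap<>();

    /** Optimistic commit: succeeds only if the version seen at read time still matches the primary. */
    boolean tryCommit(String key, long readVer, String newVal) {
        Entry cur = primary.get(key);
        if (cur == null || cur.ver != readVer)
            return false; // validation failed, the transaction has to be retried

        Entry upd = new Entry(newVal, cur.ver + 1);
        primary.put(key, upd);
        backup.put(key, upd); // a successful commit replicates the same version
        return true;
    }

    /** Read-modify-write with retries; returns true if any attempt commits. */
    boolean readModifyWrite(String key, boolean readFromBackup, int maxAttempts) {
        for (int i = 0; i < maxAttempts; i++) {
            Entry read = (readFromBackup ? backup : primary).get(key);
            if (read != null && tryCommit(key, read.ver, read.val + "'"))
                return true;
        }
        return false;
    }

    /** Hypothetical diverged state: same value, different versions on primary and backup. */
    static VersionMismatchSketch diverged() {
        VersionMismatchSketch s = new VersionMismatchSketch();
        s.primary.put("k", new Entry("v", 5L));
        s.backup.put("k", new Entry("v", 7L));
        return s;
    }

    public static void main(String[] args) {
        System.out.println("backup reads commit:  " + diverged().readModifyWrite("k", true, 100));
        System.out.println("primary reads commit: " + diverged().readModifyWrite("k", false, 100));
    }
}
```

In this model retries do not help when reading from the backup: nothing
ever updates the backup version, so every attempt fails validation, while
the same transaction commits on the first attempt when it reads from the
primary.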

Here I want to discuss both a short-term mitigation plan for the issue [2]
and longer-term changes to the replication protocol.

As a short-term solution for [2] I suggest forcing reads from the primary
node inside optimistic serializable transactions. The question is whether
to enforce this behavior only if the cache has a 3rd-party persistence
storage, or to enforce it always. Note that the version mismatch may
appear even without a 3rd-party persistence storage when an expiry policy
is used. However, in this case, the version mismatch is bounded in time by
the TTL cleanup lag. Personally, I would go with always enforcing
primary-node reads inside an optimistic serializable transaction.
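
The two variants of the short-term fix differ only in the condition under
which backup reads are disabled. A minimal sketch of both conditions
follows. The enums and method names are hypothetical and local to the
example; they are not Ignite's actual classes or its read-routing code.

```java
/** Toy decision logic (not Ignite code) for whether a read may be served by a backup copy. */
class ReadRoutingSketch {
    enum Concurrency { OPTIMISTIC, PESSIMISTIC }
    enum Isolation { READ_COMMITTED, REPEATABLE_READ, SERIALIZABLE }

    static boolean isOptimisticSerializable(Concurrency c, Isolation i) {
        return c == Concurrency.OPTIMISTIC && i == Isolation.SERIALIZABLE;
    }

    /** Variant A: primary reads are always enforced inside optimistic serializable transactions. */
    static boolean backupReadAllowedAlwaysEnforce(boolean readFromBackup, Concurrency c, Isolation i) {
        return readFromBackup && !isOptimisticSerializable(c, i);
    }

    /** Variant B: primary reads are enforced only when the cache has a 3rd-party store. */
    static boolean backupReadAllowedStoreOnly(boolean readFromBackup, boolean hasStore,
                                              Concurrency c, Isolation i) {
        return readFromBackup && !(hasStore && isOptimisticSerializable(c, i));
    }

    public static void main(String[] args) {
        // A cache with read load balancing enabled and no external store
        // (for example, one that only uses an expiry policy).
        System.out.println(backupReadAllowedAlwaysEnforce(true, Concurrency.OPTIMISTIC, Isolation.SERIALIZABLE));
        System.out.println(backupReadAllowedStoreOnly(true, false, Concurrency.OPTIMISTIC, Isolation.SERIALIZABLE));
    }
}
```

Variant B still allows backup reads for a cache without a store, so the
time-bound mismatch caused by an expiry policy stays possible there;
variant A rules it out as well.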

As a long-term solution which would eliminate the possibility of version
desync on primary and backup nodes, I would suggest revisiting the
read-through and TTL expiry semantics. It looks like quite a lot of users
are actually struggling with the current implementation of read-through
because a miss does not load the value to all partition nodes [3]. As for
TTL, I remember that clearing entries locally was a big issue for a proper
MVCC rebalance implementation (we ended up prohibiting TTL for MVCC caches).
I think it may be better to make read-through and entry expiry a
partition-wide operation with the underlying cache guarantees. For
read-through it is justified because the penalty of a partition-wide
operation is comparable with the cache store load anyway (otherwise, a
3rd-party storage makes little sense). For entry expiration it should not
make any difference because it happens in the background anyway.
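
The difference between the two read-through semantics can be sketched with
another toy model in plain Java. Everything here is an assumption made for
illustration: the class PartitionWideReadThroughSketch, the methods
getLocal and getPartitionWide, and the simple counter used as a version
are hypothetical and do not reflect Ignite's implementation.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

/** Toy model (not Ignite code): copies of one partition backed by an external store. */
class PartitionWideReadThroughSketch {
    static final class Entry {
        final String val;
        final long ver;

        Entry(String val, long ver) {
            this.val = val;
            this.ver = ver;
        }
    }

    final List<Map<String, Entry>> copies = new ArrayList<>(); // index 0 is the primary
    final Function<String, String> store;                      // stands in for the cache store
    long nextVer = 1;
    int storeLoads = 0;

    PartitionWideReadThroughSketch(int backups, Function<String, String> store) {
        for (int i = 0; i <= backups; i++)
            copies.add(new HashMap<>());
        this.store = store;
    }

    /** Local semantics: a miss populates only the copy that served the read, with its own version. */
    Entry getLocal(int node, String key) {
        Map<String, Entry> copy = copies.get(node);
        Entry e = copy.get(key);
        if (e == null) {
            storeLoads++;
            e = new Entry(store.apply(key), nextVer++);
            copy.put(key, e);
        }
        return e;
    }

    /** Partition-wide semantics: a miss loads once and installs the same entry on every copy. */
    Entry getPartitionWide(int node, String key) {
        Entry e = copies.get(node).get(key);
        if (e == null) {
            storeLoads++;
            e = new Entry(store.apply(key), nextVer++);
            for (Map<String, Entry> copy : copies)
                copy.put(key, e);
        }
        return e;
    }

    /** True if every copy either lacks the key or holds it with the same version. */
    boolean versionsAgree(String key) {
        Entry first = copies.get(0).get(key);
        for (Map<String, Entry> copy : copies) {
            Entry e = copy.get(key);
            if (first == null ? e != null : (e == null || e.ver != first.ver))
                return false;
        }
        return true;
    }

    public static void main(String[] args) {
        PartitionWideReadThroughSketch local = new PartitionWideReadThroughSketch(1, k -> "db-" + k);
        local.getLocal(0, "k");
        local.getLocal(1, "k");
        System.out.println("local:          agree=" + local.versionsAgree("k") + ", loads=" + local.storeLoads);

        PartitionWideReadThroughSketch wide = new PartitionWideReadThroughSketch(1, k -> "db-" + k);
        wide.getPartitionWide(1, "k");
        wide.getPartitionWide(0, "k");
        System.out.println("partition-wide: agree=" + wide.versionsAgree("k") + ", loads=" + wide.storeLoads);
    }
}
```

In this model the local variant hits the store once per copy and leaves
the copies with different versions, while the partition-wide variant pays
for one store load plus the replication of the loaded entry and keeps the
versions equal.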

Any thoughts on the subject are very much appreciated.

--AG

[1] http://apache-ignite-developers.2346864.n4.nabble.com/Fwd-NodeOrder-in-GridCacheVersion-td46108.html
[2] https://issues.apache.org/jira/browse/IGNITE-12739
[3] http://apache-ignite-developers.2346864.n4.nabble.com/Re-Read-through-not-working-as-expected-in-case-of-Replicated-cache-td46083.html

Anton Vinogradov-2

Re: Read load balancing, read-through, ttl and optimistic serializable transactions

Alexey,

>> In short, the root cause of this issue is that there are configurations
>> that allow a key to be stored on primary and backup nodes with different
>> versions.
I faced the same problem during ReadRepair development.

>> I suggest to force reads from a primary
>> node inside optimistic serializable transactions.
It looks like a proper fix (read-from-backup= ... && !read-through).

>> I would suggest to revisit the
>> read-through and TTL expiry semantics.
Do we really need these features?
- We have great, full-featured, consistent persistence; what's the point of
using limited-featured, inconsistent persistence via an external database?
Can we get rid of this feature in 3.0?
- Expiry policy is expensive (it slows down the cluster), does not guarantee
in-time removal, and can always be replaced by proper design (state
machine, query, eviction, in-memory cluster restart, etc).

Alexey Goncharuk

Re: Read load balancing, read-through, ttl and optimistic serializable transactions

Anton,

> Do we really need these features?
> - We have great, full-featured, consistent persistence; what's the point of
> using limited-featured, inconsistent persistence via an external database?
> Can we get rid of this feature in 3.0?
> - Expiry policy is expensive (it slows down the cluster), does not guarantee
> in-time removal, and can always be replaced by proper design (state
> machine, query, eviction, in-memory cluster restart, etc).
>

Caching a 3rd-party persistence store is one of the most widely used Ignite
use cases; I am sure we cannot drop it. Perhaps it makes sense to separate
the caching scenario into an explicit configuration and cache mode, and
perhaps even to separate the cache and database cases.

As for expiry policy, I agree that a user can always implement it at the
application level, but a user can always implement transactions as well. If
we already have a feature and we can fix it properly, why not keep it?

dmagda

Re: Read load balancing, read-through, ttl and optimistic serializable transactions

Alex, thanks for monitoring various discussion threads and sharing these
problems with the rest of the dev community.

>> As a short-term solution for [2] I suggest to force reads from a primary
>> node inside optimistic serializable transactions.

Totally agree on this. Consistency and predictable behavior matter most.
Also, it shouldn't affect performance dramatically.

>> I think it may be better to make read-through and entry expiry a
>> partition-wide operation with the underlying cache guarantees.

That's a pain in the neck! As you properly mentioned, an in-memory data
grid sitting on top of an external database is still our dominant use
case. So, a partition-wide operation assumes that if a record is read from
a CacheStore, then its value will be replicated to all the primary and
backup copies, right?

-
Denis

Alexey Goncharuk

Re: Read load balancing, read-through, ttl and optimistic serializable transactions

> That's a pain in the neck! As you properly mentioned, an in-memory data
> grid sitting on top of an external database is still our dominant use
> case. So, a partition-wide operation assumes that if a record is read from
> a CacheStore, then its value will be replicated to all the primary and
> backup copies, right?
>

Right. As I noted before, I assume the CacheStore load is an expensive
operation in itself, so this change should not add significant overhead to
a cache miss.