Hi, as suggested by Ilya here:
http://apache-ignite-users.70518.x6.nabble.com/Continuous-queries-and-duplicates-td25314.html
I'm resending it to the developers list.

From that thread we know that there might be duplicates between the initial query results and the listener entries received as part of a continuous query. That means that users need to deduplicate the data manually.

In my opinion, manual deduplication may lead to memory problems on the client side in some use cases. In order to remove duplicated notifications received in the local listener, we need to keep all initial query results in memory (or at least their unique ids). Unfortunately, there is no way (is there?) to find a point in time when we can be sure that no more duplicates will arrive. That would mean that we need to keep that data indefinitely and check it every time a new notification arrives. With multiple continuous queries run from a single JVM, this might eventually become a memory or performance problem.

I can see the following possible improvements to Ignite:

1. The deduplication between the initial query and incoming notifications could be done fully in Ignite. As far as I know, every object already has an updateCounter and a partition id, so they could be used internally.

2. Add a guarantee that notifications arriving in the local listener after the query() method returns are not duplicates. This kind of functionality would require specific synchronization inside Ignite. It would also mean that the query() method cannot return before all potential duplicates are processed by the local listener, which looks wrong.

3. Notify users that, starting from a given notification, they can be sure they will not receive any more duplicates. This could be an additional boolean flag in the CacheQueryEntryEvent.

4. CacheQueryEntryEvent already exposes the partitionUpdateCounter. Unfortunately, we don't have this information for initial query results. If we had it, a client could manually deduplicate notifications and discard the initial query results for a given partition after newer notifications arrive. It would also be very convenient to expose the partition id, although for now we can figure it out using the affinity service. The assumption here is that notifications are ordered by partitionUpdateCounter (is that true?).

Please correct me if I'm missing anything.

What do you think?

Piotr

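To make proposal 4 concrete, here is a minimal sketch of the client-side bookkeeping it would allow. It rests on two assumptions that the thread itself leaves open: that initial query results carry an update counter (they currently do not), and that notifications within a partition arrive ordered by that counter. The class `InitialResultDeduplicator` and its methods are hypothetical names, not part of the Ignite API, and the sketch expects all initial results to be recorded before notifications are processed.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical client-side helper; not part of the Ignite API. */
class InitialResultDeduplicator<K> {
    // partition id -> (key -> update counter carried by the initial query result)
    private final Map<Integer, Map<K, Long>> initial = new HashMap<>();

    /** Records one initial query result with its (assumed) update counter. */
    void recordInitial(int partition, K key, long counter) {
        initial.computeIfAbsent(partition, p -> new HashMap<>()).put(key, counter);
    }

    /** Returns true if the notification is new, false if the initial result already covers it. */
    boolean accept(int partition, K key, long counter) {
        Map<K, Long> perPartition = initial.get(partition);
        if (perPartition == null) {
            return true; // nothing retained for this partition any more
        }
        Long seen = perPartition.get(key);
        boolean duplicate = seen != null && counter <= seen;
        // Assuming in-order delivery, later notifications carry larger counters,
        // so initial results with counter <= this one can never match again.
        perPartition.values().removeIf(c -> c <= counter);
        if (perPartition.isEmpty()) {
            initial.remove(partition);
        }
        return !duplicate;
    }

    /** Number of initial results still held in memory. */
    int retainedEntries() {
        return initial.values().stream().mapToInt(Map::size).sum();
    }
}
```

Under those assumptions the retained state shrinks as notifications arrive instead of being kept indefinitely, which is the memory concern raised above. Whether the ordering assumption holds is exactly the question asked in item 4.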
Hello Piotr,
That's a known problem, and I thought a JIRA ticket already existed. However, I failed to locate it. A ticket for the improvement should be created as a result of this conversation.

Speaking of the initial query type, I would differentiate between ScanQueries and SqlQueries. For the former, it sounds reasonable to apply the partitionCounter logic. As for the latter, Vladimir Ozerov, will it be addressed as part of the MVCC/Transactional SQL activities?

Btw, Piotr, what's your initial query type?

--
Denis

On Thu, Dec 13, 2018 at 3:28 AM Piotr Romański <[hidden email]> wrote:
> [...]

Partition counter is an internal implementation detail, which has no sensible meaning to end users. It should not be exposed through the public API.

On Thu, Dec 13, 2018 at 10:14 PM Denis Magda <[hidden email]> wrote:
> [...]

Vladimir,
The partition counter is supposed to be used internally to solve the duplication issue. Does that sound like the right approach then?

What would the approach be for SQL queries? I'm not sure the partition counter is applicable.

--
Denis

On Thu, Dec 13, 2018 at 11:16 AM Vladimir Ozerov <[hidden email]> wrote:
> [...]

Denis,
Not really. Partition counters are used to ensure that the ordering of notifications is consistent with the ordering of updates, so that when a key K is updated to V1, then V2, then V3, you never observe V1 -> V3 -> V2. They also solve the duplicate notification problem in case of node failures, when the same update is delivered twice.

However, partition counters are unable to solve the duplicates problem in general. Essentially, the question is how to get a consistent view of some data plus all notifications which happened afterwards. There are only two ways to achieve this: either lock entries during the initial query, or take a kind of consistent data snapshot. The former was never implemented in Ignite; our Scan and SQL queries do not use locking. The latter is achievable in theory with MVCC. I raised that question earlier [1] (see p.2), and we came to the conclusion that it might be a good feature for the product. It is not implemented that way for MVCC now, but most probably it is not extraordinarily difficult to implement.

Vladimir.

[1] http://apache-ignite-developers.2346864.n4.nabble.com/Continuous-queries-and-MVCC-td33972.html#a33998

On Thu, Dec 13, 2018 at 11:17 PM Denis Magda <[hidden email]> wrote:
> [...]

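The role Vladimir describes for partition counters, dropping an update that is delivered twice after a node failure, can be illustrated with a small model. This is only a sketch: `PartitionCounterFilter` is a hypothetical name, and the thread does not show how Ignite implements this internally.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical model of per-partition counter filtering; not Ignite's internal code. */
class PartitionCounterFilter {
    // partition id -> highest update counter already passed to the listener
    private final Map<Integer, Long> lastDelivered = new HashMap<>();

    /** Returns true if the update is new for its partition, false if it was already delivered. */
    boolean accept(int partition, long counter) {
        Long last = lastDelivered.get(partition);
        if (last != null && counter <= last) {
            return false; // redelivered or older update: drop it
        }
        lastDelivered.put(partition, counter);
        return true;
    }
}
```

Note that the filter only compares notifications with each other. It knows nothing about which updates the initial query has already observed, which is why counters alone do not remove duplicates between the initial query and the listener.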
[1]
http://apache-ignite-developers.2346864.n4.nabble.com/Continuous-queries-and-MVCC-td33972.html

On Thu, Dec 13, 2018 at 11:38 PM Vladimir Ozerov <[hidden email]> wrote:
> [...]

Guys, FYI:
Partition counters are already a part of the public API. The following method reveals this information: CacheQueryEntryEvent#getPartitionUpdateCounter()
<https://ignite.apache.org/releases/latest/javadoc/org/apache/ignite/cache/query/CacheQueryEntryEvent.html#getPartitionUpdateCounter-->

I also think that this kind of information shouldn't be accessible to the user, but I don't see how to prevent the duplication problem with it either.

Denis

On Thu, Dec 13, 2018 at 23:40, Vladimir Ozerov <[hidden email]> wrote:
> [...]

Vladimir,
Thanks for referring to the MVCC and Continuous Queries discussion. I knew I had seen us discussing a solution to the duplication problem. Let me copy and paste it here for others:

> 2) *Initial query*. We implemented it so that the user can get some initial
> data snapshot and then start receiving events. Without MVCC we have no
> guarantees of visibility. E.g. if a key is updated from V1 to V2, it is
> possible to see V2 in the initial query and in an event. With MVCC it is now
> technically possible to query data on a certain snapshot and then receive
> only events that happened after this snapshot, so that we never see V2 twice.
> Do you think this feature will be interesting for our users?

Am I right that this would be a generic solution, whether you use a Scan or SQL query as the initial one? Have we planned it for the transactional SQL GA, or is it out of scope for now?

--
Denis
|
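To make the quoted idea concrete, here is a minimal sketch of "query on a snapshot, then receive only events that happened after it". It is a plain-Java model, not Ignite code: the `Update` type, its `version` field, and `eventsAfterSnapshot` are all hypothetical names invented for illustration, and the thread itself notes that MVCC in Ignite is not implemented this way today.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of the idea quoted above; none of these types exist in Ignite.
final class SnapshotFilterSketch {
    /** An update notification tagged with an assumed, globally ordered version. */
    static final class Update {
        final String key;
        final String value;
        final long version;

        Update(String key, String value, long version) {
            this.key = key;
            this.value = value;
            this.version = version;
        }
    }

    /** Keeps only the events committed after the snapshot the initial query ran on. */
    static List<Update> eventsAfterSnapshot(List<Update> events, long snapshotVersion) {
        List<Update> result = new ArrayList<>();
        for (Update u : events) {
            if (u.version > snapshotVersion) {
                result.add(u);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Key K was updated to V2 at version 2; the initial query ran on snapshot 2
        // and already returned V2, so the V2 event would be a duplicate.
        List<Update> events = new ArrayList<>();
        events.add(new Update("K", "V2", 2L));
        events.add(new Update("K", "V3", 3L));

        for (Update u : eventsAfterSnapshot(events, 2L)) {
            System.out.println(u.key + " -> " + u.value);
        }
    }
}
```

In this model the V2 event is dropped because its version is not newer than the snapshot, while V3 still arrives. The sketch assumes every event carries a version comparable to the snapshot version, which is exactly the piece the public API does not offer according to this thread.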
Hi all, sorry for answering so late.
I would like to use SqlQuery because I can leverage indexes there.

As was already mentioned earlier, the partition update counter is exposed through CacheQueryEntryEvent. Initially, I thought that the partition update counter was something that's persisted together with the data, but I'm guessing now that it is only a part of the notification mechanism.

I imagined that I would be able to implement my own deduplication by having 3 stages on the client side:

1. Keep processing initial query results and store their keys in memory.
2. When the initial query is over, process listener entries, but first check whether they have already been delivered in the first stage.
3. When we are sure that we are processing notifications for commits executed after the initial query was done, process listener entries without any additional checks (so the key set from stage 1 can be removed from memory).

The problem is that I have no way to tell when I can move from stage 2 to stage 3. Another problem is that we need to stash listener entries while still processing initial query results, which causes excessive memory pressure on our client.

In my case, values are immutable: I never change them, I just add a new entry for newer versions. Does it mean that I won't have any duplicates between the initial query and listener entries when using continuous queries on caches supporting MVCC?

After reading the related thread (http://apache-ignite-developers.2346864.n4.nabble.com/Continuous-queries-and-MVCC-td33972.html) I'm now concerned about the ordering. My case assumes that there are groups of entries which belong to a business aggregate object, and I would like to make sure that if I commit two records in two serial transactions, then I receive the notifications in the same order. Those entries will have different keys, so based on what you said ("we'd better to leave things as is and guarantee only per-key ordering"), it would seem that the order is not guaranteed. But do you think it would be possible to guarantee order when those entries share the same affinity key and belong to the same partition?

Piotr
|
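The three client-side stages described in this message can be sketched roughly as follows. This is plain Java with no Ignite dependency; the class and method names (`ThreeStageDedupSketch`, `onInitialQueryRow`, `onListenerEntry`, `onInitialQueryFinished`) are hypothetical, and the sketch assumes the immutable-values case where each key is written once, so a key alone identifies a duplicate.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical client-side helper; the callbacks would be wired to the initial
// query cursor and to the continuous query's local listener by the application.
final class ThreeStageDedupSketch {
    private final Set<String> initialKeys = new HashSet<>();  // stage 1: keys seen in the initial query
    private final List<String> stashed = new ArrayList<>();   // listener entries that arrived during stage 1
    private final List<String> delivered = new ArrayList<>(); // what the application actually processes
    private boolean initialQueryDone;

    /** Stage 1: process an initial query row and remember its key. */
    synchronized void onInitialQueryRow(String key) {
        initialKeys.add(key);
        delivered.add(key);
    }

    /** Listener entries are stashed during stage 1 and checked against the key set in stage 2. */
    synchronized void onListenerEntry(String key) {
        if (!initialQueryDone) {
            stashed.add(key);
            return;
        }
        deliverIfNew(key);
    }

    /** Switch from stage 1 to stage 2 and drain the stash. */
    synchronized void onInitialQueryFinished() {
        initialQueryDone = true;
        for (String key : stashed) {
            deliverIfNew(key);
        }
        stashed.clear();
    }

    private void deliverIfNew(String key) {
        if (!initialKeys.contains(key)) {
            delivered.add(key);
        }
    }

    synchronized List<String> delivered() {
        return new ArrayList<>(delivered);
    }

    /** Size of the key set that can never be released, because stage 3 is undetectable. */
    synchronized int retainedKeys() {
        return initialKeys.size();
    }
}
```

Note that there is deliberately no method for entering stage 3: nothing in the sketch can tell when `initialKeys` is safe to clear, which is the memory problem raised above. The `stashed` list shows the second concern, since it grows for as long as the initial query is still being consumed.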
> In my case, values are immutable - I never change them, I just add new
> entry for newer versions. Does it mean that I won't have any duplicates
> between the initial query and listener entries when using continuous
> queries on caches supporting MVCC?

I'm afraid there still might be a race. Val, Vladimir, other Ignite experts, please confirm.

> After reading the related thread (
> http://apache-ignite-developers.2346864.n4.nabble.com/Continuous-queries-and-MVCC-td33972.html
> ) I'm now concerned about the ordering. My case assumes that there are groups
> of entries which belong to a business aggregate object and I would like to
> make sure that if I commit two records in two serial transactions then I
> have notifications in the same order. Those entries will have different
> keys so based on what you said ("we'd better to leave things as is and
> guarantee only per-key ordering"), it would seem that the order is not
> guaranteed. But do you think it would be possible to guarantee order when
> those entries share the same affinity key and they belong to the same
> partition?

The order should be the same for key-value transactions. Vladimir, could you clarify the MVCC-based behavior?

--
Denis
|
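Until the ordering question is settled, a client could at least observe whether notifications within one partition arrive in counter order. The sketch below is a hypothetical helper, not part of Ignite: it takes a partition id and an update counter as plain numbers (the thread says the counter is exposed on CacheQueryEntryEvent and the partition can be derived through the affinity service) and it assumes, without confirmation from this thread, that a higher counter means a later update within the same partition.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical diagnostic helper; it only checks an assumption, it does not enforce ordering.
final class PartitionOrderCheckSketch {
    // Highest update counter observed so far for each partition.
    private final Map<Integer, Long> lastCounter = new HashMap<>();

    /**
     * Returns true if the counter is strictly higher than the last one seen for
     * this partition, and records it. Returns false for a repeated or older counter.
     */
    synchronized boolean inOrder(int partition, long updateCounter) {
        Long last = lastCounter.get(partition);
        if (last != null && updateCounter <= last) {
            return false;
        }
        lastCounter.put(partition, updateCounter);
        return true;
    }
}
```

A `false` result would mean either a redelivered notification or an out-of-order one; the helper cannot distinguish the two. Counters from different partitions are tracked independently, so it says nothing about ordering across partitions.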
Hi,
MVCC caches have the same ordering guarantees as non-MVCC caches, i.e. two subsequent updates on a single key will be delivered in the proper order. There are no guarantees beyond that. The order of updates in two subsequent transactions affecting the same partition may be preserved by the current implementation (though I am not sure), but even if so, I am not aware that this was ever our design goal. Most likely, this is an implementation artifact which may change in the future. Cache experts are needed to clarify this.

As for MVCC, data anomalies are still possible in the current implementation, because we didn't rework initial query handling in the first iteration; technically it is not as simple as we thought. Once a snapshot is obtained, a query over that snapshot will return a data set consistent as of some point in time. But the problem is that there is a time frame between snapshot acquisition and listener installation (or vice versa), which leads to either duplicates or lost entries. Some multi-step listener installation will be required here. We haven't designed it yet.

Vladimir.
> > > > >> > > > > > > >> > > -- > > > > >> > > Denis > > > > >> > > > > > > >> > > On Thu, Dec 13, 2018 at 3:28 AM Piotr Romański < > > > > >> [hidden email] > > > > >> > > > > > > >> > > wrote: > > > > >> > > > > > > >> > > > Hi, as suggested by Ilya here: > > > > >> > > > > > > > >> > > > > > > > >> > > > > > > >> > > > > > >> > > > > > > > > > > http://apache-ignite-users.70518.x6.nabble.com/Continuous-queries-and-duplicates-td25314.html > > > > >> > > > I'm resending it to the developers list. > > > > >> > > > > > > > >> > > > From that thread we know that there might be duplicates > > between > > > > >> initial > > > > >> > > > query results and listener entries received as part of > > > continuous > > > > >> > query. > > > > >> > > > That means that users need to manually dedupe data. > > > > >> > > > > > > > >> > > > In my opinion the manual deduplication in some use cases may > > > lead > > > > to > > > > >> > > > possible memory problems on the client side. In order to > > remove > > > > >> > > duplicated > > > > >> > > > notifications which we are receiving in the local listener, > we > > > > need > > > > >> to > > > > >> > > keep > > > > >> > > > all initial query results in memory (or at least their > unique > > > > ids). > > > > >> > > > Unfortunately, there is no way (is there?) to find a point > in > > > time > > > > >> when > > > > >> > > we > > > > >> > > > can be sure that no dups will arrive anymore. That would > mean > > > that > > > > >> we > > > > >> > > need > > > > >> > > > to keep that data indefinitely and use it every time a new > > > > >> notification > > > > >> > > > arrives. In case of multiple continuous queries run from a > > > single > > > > >> JVM, > > > > >> > > this > > > > >> > > > might eventually become a memory or performance problem. I > can > > > see > > > > >> the > > > > >> > > > following possible improvements to Ignite: > > > > >> > > > > > > > >> > > > 1. 
The deduplication between initial query and incoming > > > > notification > > > > >> > > could > > > > >> > > > be done fully in Ignite. As far as I know there is already > the > > > > >> > > > updateCounter and partition id for all the objects so it > could > > > be > > > > >> used > > > > >> > > > internally. > > > > >> > > > > > > > >> > > > 2. Add a guarantee that notifications arriving in the local > > > > listener > > > > >> > > after > > > > >> > > > query() method returns are not duplicates. This kind of > > > > >> functionality > > > > >> > > would > > > > >> > > > require a specific synchronization inside Ignite. It would > > also > > > > mean > > > > >> > that > > > > >> > > > the query() method cannot return before all potential > > duplicates > > > > are > > > > >> > > > processed by a local listener what looks wrong. > > > > >> > > > > > > > >> > > > 3. Notify users that starting from a given notification they > > can > > > > be > > > > >> > sure > > > > >> > > > they will not receive any duplicates anymore. This could be > an > > > > >> > additional > > > > >> > > > boolean flag in the CacheQueryEntryEvent. > > > > >> > > > > > > > >> > > > 4. CacheQueryEntryEvent already exposes the > > > > partitionUpdateCounter. > > > > >> > > > Unfortunately we don't have this information for initial > query > > > > >> results. > > > > >> > > If > > > > >> > > > we had, a client could manually deduplicate notifications > and > > > get > > > > >> rid > > > > >> > of > > > > >> > > > initial query results for a given partition after newer > > > > >> notifications > > > > >> > > > arrive. Also it would be very convenient to expose partition > > id > > > as > > > > >> well > > > > >> > > but > > > > >> > > > now we can figure it out using the affinity service. The > > > > assumption > > > > >> > here > > > > >> > > is > > > > >> > > > that notifications are ordered by partitionUpdateCounter (is > > it > > > > >> true?). 
> > > > >> > > > > > > > >> > > > Please correct me if I'm missing anything. > > > > >> > > > > > > > >> > > > What do you think? > > > > >> > > > > > > > >> > > > Piotr > > > > >> > > > > > > > >> > > > > > > >> > > > > > >> > > > > > > > > > > > > > > > |
Hi Vladimir, thank you for your response.

I tested the current behaviour and it seems that the order is maintained for notifications within a partition. Unfortunately, I don't know how it would behave in exceptional situations such as partition loss or rebalancing. Do you think it would be possible to make that ordering guarantee part of the Ignite API? What I really need is ordering for notifications that share the same affinity key, not even a whole partition, so I think it wouldn't require any cross-node ordering.

Thank you,

Piotr |
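For readers who want to repeat the kind of test Piotr describes, here is a minimal sketch of a client-side check for per-partition ordering. It is not part of Ignite: `PartitionOrderChecker` and its methods are hypothetical names made up for illustration. In a real listener the partition id would come from the affinity service (`ignite.affinity(cacheName).partition(key)`) and the counter from `CacheQueryEntryEvent.getPartitionUpdateCounter()`. The sketch assumes that the counter grows with every update inside a partition, which is exactly the point this thread leaves open, so a clean run under normal conditions says nothing about node failures or rebalancing.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/**
 * Hypothetical helper (not an Ignite class): records whether notifications
 * arrive with strictly increasing update counters within each partition.
 */
class PartitionOrderChecker {
    /** Highest update counter accepted so far, keyed by partition id. */
    private final Map<Integer, Long> lastCounter = new HashMap<>();

    /** Description of every notification that arrived out of order or twice. */
    private final List<String> violations = new ArrayList<>();

    /**
     * Feeds one notification into the checker.
     *
     * @param partition     partition id of the notified key
     * @param updateCounter partition update counter carried by the event
     * @return true if the counter is strictly greater than the last one
     *         accepted for this partition, false for a duplicate or reordering
     */
    boolean onNotification(int partition, long updateCounter) {
        Long prev = lastCounter.get(partition);
        if (prev != null && updateCounter <= prev) {
            violations.add("partition " + partition + ": counter "
                + updateCounter + " arrived after " + prev);
            return false;
        }
        // Gaps are accepted: a filtered query may not see every update.
        lastCounter.put(partition, updateCounter);
        return true;
    }

    /** Returns the violations recorded so far, in arrival order. */
    List<String> violations() {
        return violations;
    }
}
```

Calling `onNotification` from the local listener and inspecting `violations()` at the end of a test run gives a quick signal. Tracking by affinity key instead of partition would only require a different map key.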
Hi Piotr,
Unfortunately I do not have answer to the question about ordering guarantees during node crashes for the same affinity key. Hopefully some other Ignite experts would be able to help. But in any case I doubt we will be able to have public guarantee on the same affinity key, as opposed to current approach (key itself), Vladimir. On Fri, Jan 11, 2019 at 5:24 PM Piotr Romański <[hidden email]> wrote: > Hi Vladimir, thank you for your response. I tested the current behaviour > and it seems that the order is maintained for notifications within a > partition. Unfortunately, I don’t know how it would behave in exceptional > situations like losing partitions, rebalancing etc. Do you think it would > be possible to make that ordering guarantee to be a part of the Ignite API? > What I would really need is to have order for notifications sharing the > same affinity key, not even a partition. So I think it wouldn’t require any > cross-node ordering. > > Thank you, > > Piotr > > śr., 9 sty 2019, 21:11: Vladimir Ozerov <[hidden email]> napisał(a): > > > Hi, > > > > MVCC caches have the same ordering guarantees as non-MVCC caches, i.e. > two > > subsequent updates on a single key will be delivered in proper order. > There > > is no guarantees Order of updates on two subsequent transactions > affecting > > the same partition may be guaranteed with current implementation > (though. I > > am not sure), but even if it is so, I am not aware that this was ever our > > design goal. Most likely, this is an implementation artifact which may be > > changed in future. Cache experts are needed to clarify this. > > > > As far as MVCC, data anomalies are still possible in current > > implementation, because we didn't rework initial query handling in the > > first iteration, because technically this is not so simple as we thought. > > Once snapshot is obtained, query over that snapshot will return a data > set > > consistent at some point in time. 
But the problem is that there is a time > > frame between snapshot acquisition and listener installation (or vice > > versa), what leads to either duplicates or lost entries. Some multi-step > > listener installation will be required here. We haven't designed it yet. > > > > Vladimir. > > > > > > > > On Mon, Dec 24, 2018 at 10:06 PM Denis Magda <[hidden email]> wrote: > > > > > > > > > > In my case, values are immutable - I never change them, I just add > new > > > > entry for newer versions. Does it mean that I won't have any > duplicates > > > > between the initial query and listener entries when using continuous > > > > queries on caches supporting MVCC? > > > > > > > > > I'm afraid there still might be a race. Val, Vladimir, other Ignite > > > experts, please confirm. > > > > > > After reading the related thread ( > > > > > > > > > > > > > > http://apache-ignite-developers.2346864.n4.nabble.com/Continuous-queries-and-MVCC-td33972.html > > > > ) > > > > I'm now concerned about the ordering. My case assumes that there are > > > groups > > > > of entries which belong to a business aggregate object and I would > like > > > to > > > > make sure that if I commit two records in two serial transactions > then > > I > > > > have notifications in the same order. Those entries will have > different > > > > keys so based on what you said ("we'd better to leave things as is > and > > > > guarantee only per-key ordering"), it would seem that the order is > not > > > > guaranteed. But do you think it would possible to guarantee order > when > > > > those entries share the same affinity key and they belong to the same > > > > partition? > > > > > > > > > The order should be the same for key-value transactions. Vladimir, > could > > > you clear out MVCC based behavior? > > > > > > -- > > > Denis > > > > > > On Mon, Dec 17, 2018 at 9:55 AM Piotr Romański < > [hidden email] > > > > > > wrote: > > > > > > > Hi all, sorry for answering so late. 
> > > > > > > > I would like to use SqlQuery because I can leverage indexes there. > > > > > > > > As it was already mentioned earlier, the partition update counter is > > > > exposed through CacheQueryEntryEvent. Initially, I thought that the > > > > partition update counter is something what's persisted together with > > the > > > > data but I'm guessing now that this is only a part of the > notification > > > > mechanism. > > > > > > > > I imagined that I would be able to implement my own deduplicaton by > > > having > > > > 3 stages on the client side: 1. Keep processing initial query > results, > > > > store their keys in memory, 2. When initial query is over, then > process > > > > listener entries but before that check if they have been already > > > delivered > > > > in the first stage, 3. When we are sure that we are already > processing > > > > notifications for commits executed after initial query was done, then > > we > > > > can process listener entries without any additional checks (so our > key > > > set > > > > from stage 1 can be removed from memory). The problem is that I have > no > > > way > > > > to say that I can move from stage 2 to 3. Another problem is that we > > need > > > > to stash listener entries while still processing initial query > results > > > > causing an excessive memory pressure on our client. > > > > > > > > In my case, values are immutable - I never change them, I just add > new > > > > entry for newer versions. Does it mean that I won't have any > duplicates > > > > between the initial query and listener entries when using continuous > > > > queries on caches supporting MVCC? > > > > > > > > After reading the related thread ( > > > > > > > > > > > > > > http://apache-ignite-developers.2346864.n4.nabble.com/Continuous-queries-and-MVCC-td33972.html > > > > ) > > > > I'm now concerned about the ordering. 
My case assumes that there are > > > groups > > > > of entries which belong to a business aggregate object and I would > like > > > to > > > > make sure that if I commit two records in two serial transactions > then > > I > > > > have notifications in the same order. Those entries will have > different > > > > keys so based on what you said ("we'd better to leave things as is > and > > > > guarantee only per-key ordering"), it would seem that the order is > not > > > > guaranteed. But do you think it would possible to guarantee order > when > > > > those entries share the same affinity key and they belong to the same > > > > partition? > > > > > > > > Piotr > > > > > > > > pt., 14 gru 2018, 19:31: Denis Magda <[hidden email]> napisał(a): > > > > > > > > > Vladimir, > > > > > > > > > > Thanks for referring to the MVCC and Continuous Queries > discussion, I > > > > knew > > > > > that saw us discussing a solution of the duplication problem. Let > me > > > copy > > > > > and paste it in here for others: > > > > > > > > > > 2) *Initial query*. We implemented it so that user can get some > > initial > > > > > > data snapshot and then start receiving events. Without MVCC we > have > > > no > > > > > > guarantees of visibility. E.g. if key is updated from V1 to V2, > it > > is > > > > > > possible to see V2 in initial query and in event. With MVCC it is > > now > > > > > > technically possible to query data on certain snapshot and then > > > receive > > > > > > only events happened after this snapshot. So that we never see V2 > > > > twice. > > > > > > Do > > > > > > you think we this feature will be interesting for our users? > > > > > > > > > > > > > > > Am I right that this would be a generic solution - whether you use > > Scan > > > > or > > > > > SQL query as an initial one? Have we planned it for the > transactional > > > SQL > > > > > GA or it's out of scope for now? 
> > > > > > > > > > -- > > > > > Denis > > > > > > > > > > On Thu, Dec 13, 2018 at 12:40 PM Vladimir Ozerov < > > [hidden email] > > > > > > > > > wrote: > > > > > > > > > > > [1] > > > > > > > > > > > > > > > > > > > > > > > > > > > http://apache-ignite-developers.2346864.n4.nabble.com/Continuous-queries-and-MVCC-td33972.html > > > > > > > > > > > > On Thu, Dec 13, 2018 at 11:38 PM Vladimir Ozerov < > > > [hidden email] > > > > > > > > > > > wrote: > > > > > > > > > > > > > Denis, > > > > > > > > > > > > > > Not really. They are used to ensure that ordering of > > notifications > > > is > > > > > > > consistent with ordering of updates, so that when a key K is > > > updated > > > > to > > > > > > V1, > > > > > > > then V2, then V3, you never observe V1 -> V3 -> V2. It also > > solves > > > > > > > duplicate notification problem in case of node failures, when > the > > > > same > > > > > > > update is delivered twice. > > > > > > > > > > > > > > However, partition counters are unable to solve duplicates > > problem > > > in > > > > > > > general. Essentially, the question is how to get consistent > view > > on > > > > > some > > > > > > > data plus all notifications which happened afterwards. There > are > > > only > > > > > two > > > > > > > ways to achieve this - either lock entries during initial > query, > > or > > > > > take > > > > > > a > > > > > > > kind of consistent data snapshot. The former was never > > implemented > > > in > > > > > > > Ignite - our Scan and SQL queries do not user locking. The > latter > > > is > > > > > > > achievable in theory with MVCC. I raised that question earlier > > [1] > > > > (see > > > > > > > p.2), and we came to conclusion that it might be a good feature > > for > > > > the > > > > > > > product. It is not implemented that way for MVCC now, but most > > > > probably > > > > > > is > > > > > > > not extraordinary difficult to implement. > > > > > > > > > > > > > > Vladimir. 

[1] http://apache-ignite-developers.2346864.n4.nabble.com/Continuous-queries-and-MVCC-td33972.html#a33998

On Thu, Dec 13, 2018 at 11:17 PM Denis Magda <[hidden email]> wrote:

Vladimir,

The partition counter is supposed to be used internally to solve the duplication issue. Does it sound like the right approach then?

What would be an approach for SQL queries? Not sure the partition counter is applicable.

--
Denis

On Thu, Dec 13, 2018 at 11:16 AM Vladimir Ozerov <[hidden email]> wrote:

The partition counter is an internal implementation detail, which has no sensible meaning to end users. It should not be exposed through the public API.

On Thu, Dec 13, 2018 at 10:14 PM Denis Magda <[hidden email]> wrote:

Hello Piotr,

That's a known problem and I thought a JIRA ticket already existed. However, I failed to locate it. The ticket for the improvement should be created as a result of this conversation.

Speaking of the initial query type, I would differentiate between ScanQueries and SqlQueries. For the former, it sounds reasonable to apply the partitionCounter logic. As for the latter, Vladimir Ozerov, will it be addressed as part of the MVCC/Transactional SQL activities?

Btw, Piotr, what's your initial query type?

--
Denis

On Thu, Dec 13, 2018 at 3:28 AM Piotr Romański <[hidden email]> wrote:

Hi, as suggested by Ilya here:
http://apache-ignite-users.70518.x6.nabble.com/Continuous-queries-and-duplicates-td25314.html
I'm resending it to the developers list.

From that thread we know that there might be duplicates between initial query results and listener entries received as part of a continuous query. That means that users need to manually dedupe data.

In my opinion the manual deduplication in some use cases may lead to memory problems on the client side. In order to remove duplicated notifications which we are receiving in the local listener, we need to keep all initial query results in memory (or at least their unique ids). Unfortunately, there is no way (is there?) to find a point in time when we can be sure that no dups will arrive anymore. That would mean that we need to keep that data indefinitely and use it every time a new notification arrives. In case of multiple continuous queries run from a single JVM, this might eventually become a memory or performance problem. I can see the following possible improvements to Ignite:

1. The deduplication between the initial query and incoming notifications could be done fully in Ignite. As far as I know there is already the updateCounter and partition id for all the objects, so it could be used internally.

2. Add a guarantee that notifications arriving in the local listener after the query() method returns are not duplicates. This kind of functionality would require a specific synchronization inside Ignite. It would also mean that the query() method cannot return before all potential duplicates are processed by a local listener, which looks wrong.

3. Notify users that starting from a given notification they can be sure they will not receive any duplicates anymore. This could be an additional boolean flag in the CacheQueryEntryEvent.

4. CacheQueryEntryEvent already exposes the partitionUpdateCounter. Unfortunately we don't have this information for initial query results. If we had, a client could manually deduplicate notifications and get rid of initial query results for a given partition after newer notifications arrive. Also it would be very convenient to expose the partition id as well, but for now we can figure it out using the affinity service. The assumption here is that notifications are ordered by partitionUpdateCounter (is it true?).

Please correct me if I'm missing anything.

What do you think?

Piotr |
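
To make improvement 4 from the original message concrete, here is a minimal sketch of the client-side bookkeeping it would allow. It rests on two assumptions that the thread leaves open: that initial query results would carry a partition id and an update counter (they do not today), and that notifications within one partition arrive in increasing counter order. The class name `PartitionCounterDeduper` and its `accept` method are hypothetical and are not part of the Ignite API.

```java
import java.util.HashMap;
import java.util.Map;

/**
 * Hypothetical client-side filter, NOT part of the Ignite API.
 * Assumes every entry (initial-query result or notification) comes with
 * a partition id and a per-partition update counter, and that counters
 * within one partition only grow.
 */
final class PartitionCounterDeduper {
    /** Highest update counter accepted so far, keyed by partition id. */
    private final Map<Integer, Long> highestSeen = new HashMap<>();

    /**
     * Registers an entry and tells the caller whether to process it.
     *
     * @param partition     partition id of the entry
     * @param updateCounter update counter of the entry within that partition
     * @return true if the entry is newer than anything accepted so far for
     *         this partition, false if it is a duplicate or an older update
     */
    synchronized boolean accept(int partition, long updateCounter) {
        Long seen = highestSeen.get(partition);
        if (seen != null && updateCounter <= seen) {
            return false; // already covered by an earlier entry
        }
        highestSeen.put(partition, updateCounter);
        return true;
    }
}
```

With this shape the client keeps one number per partition instead of every initial query result, which is the memory saving the proposal is after. As Vladimir points out above, counters alone do not give a consistent snapshot, so this only illustrates the bookkeeping and does not settle the general duplicates problem.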