Folks,
I have received a number of complaints from users that our default settings favor performance at the cost of correctness, which leads to subtle behavior. Yesterday I ran into one such situation myself.

I started a REPLICATED cache on several nodes, put some data, executed a simple SQL query and got a wrong result. No errors, no warnings. The problem was caused by the default PRIMARY_SYNC mode. WTF, our cache doesn't work out of the box!

Other widely known examples are data streamer behavior and "read from backups" + continuous queries.

I propose to change our defaults to favor *correctness* over performance, and to create good documentation and JavaDocs that explain to users how to tune our product. Proposed changes:

1) FULL_SYNC as default;
2) "readFromBackup=false" as default;
3) "IgniteDataStreamer.allowOverwrite=true" as default.

Users should not have to think about how to make Ignite work correctly. It should be correct out of the box.

Vladimir.
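For readers who want the proposed behavior today, the three settings can be set explicitly. The following is only a minimal sketch: it assumes an Ignite dependency on the classpath, the cache name "demo" and the sample key/value are made up for illustration, and the exact defaults depend on the Ignite version in use.

```java
import org.apache.ignite.Ignite;
import org.apache.ignite.IgniteCache;
import org.apache.ignite.IgniteDataStreamer;
import org.apache.ignite.Ignition;
import org.apache.ignite.cache.CacheMode;
import org.apache.ignite.cache.CacheWriteSynchronizationMode;
import org.apache.ignite.configuration.CacheConfiguration;

class CorrectnessFirstConfig {
    public static void main(String[] args) {
        // Starts a node with the default configuration.
        try (Ignite ignite = Ignition.start()) {
            // "demo" is a hypothetical cache name used only for this sketch.
            CacheConfiguration<Integer, String> cfg = new CacheConfiguration<>("demo");
            cfg.setCacheMode(CacheMode.REPLICATED);

            // 1) Acknowledge a write only after backups are updated too.
            cfg.setWriteSynchronizationMode(CacheWriteSynchronizationMode.FULL_SYNC);

            // 2) Serve reads from the primary copy instead of a local backup copy.
            cfg.setReadFromBackup(false);

            IgniteCache<Integer, String> cache = ignite.getOrCreateCache(cfg);

            try (IgniteDataStreamer<Integer, String> streamer = ignite.dataStreamer(cache.getName())) {
                // 3) Let the streamer overwrite keys that already exist in the cache.
                streamer.allowOverwrite(true);
                streamer.addData(1, "one");
            }
        }
    }
}
```

Each of the three calls trades some throughput for more predictable results, which is exactly the trade-off under discussion in this thread.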
Vladimir,
What is wrong with a query in PRIMARY_SYNC mode? Why won't it work?

D.

On Tue, Apr 18, 2017 at 12:25 AM, Vladimir Ozerov <[hidden email]> wrote:
> Folks,
>
> I have received a number of complaints from users that our default settings favor performance at the cost of correctness, which leads to subtle behavior. Yesterday I ran into one such situation myself.
>
> I started a REPLICATED cache on several nodes, put some data, executed a simple SQL query and got a wrong result. No errors, no warnings. The problem was caused by the default PRIMARY_SYNC mode. WTF, our cache doesn't work out of the box!
>
> Other widely known examples are data streamer behavior and "read from backups" + continuous queries.
>
> I propose to change our defaults to favor *correctness* over performance, and to create good documentation and JavaDocs that explain to users how to tune our product. Proposed changes:
>
> 1) FULL_SYNC as default;
> 2) "readFromBackup=false" as default;
> 3) "IgniteDataStreamer.allowOverwrite=true" as default.
>
> Users should not have to think about how to make Ignite work correctly. It should be correct out of the box.
>
> Vladimir.
With a replicated cache we can execute a query against backup partitions that were not updated yet because of PRIMARY_SYNC. Thus we do not see the update.

Sergi

2017-04-18 10:30 GMT+03:00 Dmitriy Setrakyan <[hidden email]>:
> Vladimir,
>
> What is wrong with a query in PRIMARY_SYNC mode? Why won't it work?
>
> D.
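The effect described above can be reproduced with a toy model. This is a sketch in plain Java, not Ignite code: the class and method names are invented for illustration, there is one primary copy and one backup copy, and the delayed backup update is delivered by an explicit call so that the outcome is deterministic.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;

/** Toy model of one primary copy and one backup copy of a replicated cache. Not Ignite code. */
class ReplicatedCacheSim {
    enum SyncMode { PRIMARY_SYNC, FULL_SYNC }

    private final SyncMode mode;
    private final Map<String, Integer> primary = new HashMap<>();
    private final Map<String, Integer> backup = new HashMap<>();

    /** Backup updates that were sent but not applied yet. */
    private final Queue<Runnable> inFlight = new ArrayDeque<>();

    ReplicatedCacheSim(SyncMode mode) {
        this.mode = mode;
    }

    /** Returns as soon as the write counts as acknowledged under the configured mode. */
    void put(String key, int val) {
        primary.put(key, val);

        Runnable backupUpdate = () -> backup.put(key, val);

        if (mode == SyncMode.FULL_SYNC)
            backupUpdate.run();         // wait for the backup before acknowledging
        else
            inFlight.add(backupUpdate); // acknowledge now, the backup catches up later
    }

    /** Simulates delivery of the delayed backup updates. */
    void deliverBackupUpdates() {
        Runnable update;

        while ((update = inFlight.poll()) != null)
            update.run();
    }

    /** A "local query" executed on the node that holds only the backup copy. */
    Integer queryOnBackupNode(String key) {
        return backup.get(key);
    }
}

class StaleReadDemo {
    public static void main(String[] args) {
        ReplicatedCacheSim cache = new ReplicatedCacheSim(ReplicatedCacheSim.SyncMode.PRIMARY_SYNC);

        cache.put("k", 1);
        System.out.println("right after put: " + cache.queryOnBackupNode("k"));

        cache.deliverBackupUpdates();
        System.out.println("after backup catches up: " + cache.queryOnBackupNode("k"));
    }
}
```

With PRIMARY_SYNC the first query misses the value even though `put` has already returned; constructing the model with FULL_SYNC makes the value visible to the backup-side query immediately.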
This sounds more like an issue with query execution, rather than wrong PRIMARY_SYNC behavior. We already had a discussion about this optimization in replicated cache and decided to switch it off by default.

-Val

On Tue, Apr 18, 2017 at 9:35 AM, Sergi Vladykin <[hidden email]> wrote:
> With a replicated cache we can execute a query against backup partitions that were not updated yet because of PRIMARY_SYNC. Thus we do not see the update.
>
> Sergi
Val,
I'm not sure I understand what optimization you are talking about and what exactly you decided to switch off. Can you explain, please?

Sergi

2017-04-18 10:42 GMT+03:00 Valentin Kulichenko <[hidden email]>:
> This sounds more like an issue with query execution, rather than wrong PRIMARY_SYNC behavior. We already had a discussion about this optimization in replicated cache and decided to switch it off by default.
>
> -Val
In reply to this post by Valentin Kulichenko
Val,
PRIMARY_SYNC doesn't work correctly with the most common case of SQL query execution over a REPLICATED cache. It also has weird consequences for continuous queries when coupled with another performance-over-correctness property, "readFromBackup=true": a user may receive a CQ notification with the new value, but a subsequent GET on the local node may return the old value.

On Tue, Apr 18, 2017 at 10:42 AM, Valentin Kulichenko <[hidden email]> wrote:
> This sounds more like an issue with query execution, rather than wrong PRIMARY_SYNC behavior. We already had a discussion about this optimization in replicated cache and decided to switch it off by default.
>
> -Val
Sergi,
I'm talking about this discussion:
http://apache-ignite-developers.2346864.n4.nabble.com/SQL-on-PARTITIONED-vs-REPLICATED-cache-td16478.html

-Val

On Tue, Apr 18, 2017 at 9:46 AM, Vladimir Ozerov <[hidden email]> wrote:
> Val,
>
> PRIMARY_SYNC doesn't work correctly with the most common case of SQL query execution over a REPLICATED cache. It also has weird consequences for continuous queries when coupled with another performance-over-correctness property, "readFromBackup=true": a user may receive a CQ notification with the new value, but a subsequent GET on the local node may return the old value.
Val,
That discussion has nothing to do with this PRIMARY_SYNC problem.

Sergi

2017-04-18 10:51 GMT+03:00 Valentin Kulichenko <[hidden email]>:
> Sergi,
>
> I'm talking about this discussion:
> http://apache-ignite-developers.2346864.n4.nabble.com/SQL-on-PARTITIONED-vs-REPLICATED-cache-td16478.html
>
> -Val
Can you please elaborate then? What is the logic there?

-Val

On Tue, Apr 18, 2017 at 9:55 AM, Sergi Vladykin <[hidden email]> wrote:
> Val,
>
> That discussion has nothing to do with this PRIMARY_SYNC problem.
>
> Sergi
Val,
There we were not able to run queries against partitioned tables using the replicated cache API (I already fixed that in master).

Here we are talking about query result inconsistency in the case of PRIMARY_SYNC because of asynchronous backup updates.

Sergi

2017-04-18 11:04 GMT+03:00 Valentin Kulichenko <[hidden email]>:
> Can you please elaborate then? What is the logic there?
>
> -Val
Sergi, I am confused. If we don't read from backups, then why do we care about sync or async backup updates?

On Tue, Apr 18, 2017 at 1:11 AM, Sergi Vladykin <[hidden email]> wrote:
> Val,
>
> There we were not able to run queries against partitioned tables using the replicated cache API (I already fixed that in master).
>
> Here we are talking about query result inconsistency in the case of PRIMARY_SYNC because of asynchronous backup updates.
>
> Sergi
We never read from backups on a partitioned cache, but for a replicated cache we do that to be able to execute the whole query on a single node locally.

Sergi

2017-04-18 12:07 GMT+03:00 Dmitriy Setrakyan <[hidden email]>:
> Sergi, I am confused. If we don't read from backups, then why do we care about sync or async backup updates?
On Tue, Apr 18, 2017 at 2:21 AM, Sergi Vladykin <[hidden email]> wrote:

> We never read from backups on partitioned cache, but for replicated we do
> that to be able to execute the whole query on single node locally.

But I thought that we agreed to change that behavior and have REPLICATED caches work the same as PARTITIONED. I think Valentin provided a link to the discussion we had on the dev list.

I would not make FULL_SYNC the default, but I would definitely fix this behavior for REPLICATED caches.

D.
It only means that we will always parse the query and check whether it contains only replicated tables. If it does, we execute the query on a single node across all the partitions.

Sergi
In reply to this post by dsetrakyan
Dima,
If we change the behavior of REPLICATED caches this way, then instead of nice scalability out of the box, users will get slow distributed joins by default. All we need to do is set strict FULL_SYNC as the default.
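To make the effect under discussion concrete, here is a toy model of one primary copy and one backup copy. It is not Ignite code: the class `SyncModeDemo`, its methods, and the explicit `deliverPending()` step are all hypothetical names invented for this sketch, and replication delay is simulated with a queue rather than a network.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;

/** Toy model of a primary copy and a backup copy; hypothetical, not Ignite code. */
class SyncModeDemo {
    enum Mode { PRIMARY_SYNC, FULL_SYNC }

    private final Mode mode;
    private final Map<String, Integer> primary = new HashMap<>();
    private final Map<String, Integer> backup = new HashMap<>();
    // Updates already applied on the primary but not yet received by the backup.
    private final Queue<Map.Entry<String, Integer>> inFlight = new ArrayDeque<>();

    SyncModeDemo(Mode mode) {
        this.mode = mode;
    }

    /** Returns as soon as the write is acknowledged according to the mode. */
    void put(String key, int val) {
        primary.put(key, val);
        if (mode == Mode.FULL_SYNC)
            backup.put(key, val); // the caller waits for the backup as well
        else
            inFlight.add(new SimpleEntry<>(key, Integer.valueOf(val))); // backup catches up later
    }

    /** Simulates the delayed replication messages reaching the backup. */
    void deliverPending() {
        Map.Entry<String, Integer> e;
        while ((e = inFlight.poll()) != null)
            backup.put(e.getKey(), e.getValue());
    }

    /** A query executed locally on the node that holds the backup copy. */
    Integer queryBackup(String key) {
        return backup.get(key);
    }

    public static void main(String[] args) {
        SyncModeDemo primarySync = new SyncModeDemo(Mode.PRIMARY_SYNC);
        primarySync.put("k", 1);
        System.out.println(primarySync.queryBackup("k")); // null: the update is not visible yet
        primarySync.deliverPending();
        System.out.println(primarySync.queryBackup("k")); // 1

        SyncModeDemo fullSync = new SyncModeDemo(Mode.FULL_SYNC);
        fullSync.put("k", 1);
        System.out.println(fullSync.queryBackup("k")); // 1: visible as soon as put() returns
    }
}
```

In this model the stale read appears only when a query is served from the backup copy before the pending update arrives, which is the situation described for REPLICATED caches with PRIMARY_SYNC.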
Guys, what if we look at this from another angle: we can switch to reading from primary only when there is a primary_sync operation that is not yet acked by backups. Or we can wait until all such operations are acked and then proceed with the query. This seems to work when a query follows a sequence of puts, but fails when a sequence of puts is followed by a compute job that spawns a query from a remote node. It also seems to bring a lot of complications to the cache update protocol.

Given this, I would vote for switching the default (probably for replicated caches only) to full_sync and output a performance warning.

However, there is still an open question: how can I guarantee query consistency with primary_sync?

--Yakov
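One possible reading of this idea, sketched as a toy model rather than as a change to the real cache update protocol: the writing node remembers which of its updates are still un-acked and reads from the primary while any remain. Everything here is an assumption made for illustration: the class `PendingAwareQueryDemo`, its method names, and the premise that only the writing node can see the un-acked list.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;

/** Toy model of "read from primary while updates are un-acked"; hypothetical, not Ignite code. */
class PendingAwareQueryDemo {
    private final Map<String, Integer> primary = new HashMap<>();
    private final Map<String, Integer> backup = new HashMap<>();
    // Updates issued by the writer node that the backup has not acknowledged yet.
    private final Queue<Map.Entry<String, Integer>> unacked = new ArrayDeque<>();

    /** PRIMARY_SYNC-style put: applied on the primary, replicated later. */
    void put(String key, int val) {
        primary.put(key, val);
        unacked.add(new SimpleEntry<>(key, Integer.valueOf(val)));
    }

    /** Simulates the backup applying and acknowledging every pending update. */
    void ackAll() {
        Map.Entry<String, Integer> e;
        while ((e = unacked.poll()) != null)
            backup.put(e.getKey(), e.getValue());
    }

    /** Query from the writer node: it knows about pending updates and avoids the backup. */
    Integer queryFromWriter(String key) {
        return unacked.isEmpty() ? backup.get(key) : primary.get(key);
    }

    /** Query from another node: it has no view of the writer's pending list. */
    Integer queryFromRemote(String key) {
        return backup.get(key);
    }

    public static void main(String[] args) {
        PendingAwareQueryDemo demo = new PendingAwareQueryDemo();
        demo.put("k", 7);
        System.out.println(demo.queryFromWriter("k")); // 7: falls back to the primary
        System.out.println(demo.queryFromRemote("k")); // null: the remote query reads a stale backup
        demo.ackAll();
        System.out.println(demo.queryFromRemote("k")); // 7
    }
}
```

The second call corresponds to the failing case mentioned above, where a compute job spawns the query from a node that did not issue the puts.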
Yakov,
The idea of tracking current operations and waiting if needed looks overcomplicated and most probably will result in a performance drop. I think there is no way to have this guarantee with PRIMARY_SYNC in the general case.

Sergi
Sergi, most probably performance will not be affected very much, since we can batch delayed responses, and this applies to primary_sync only. However, I agree with your point about overcomplicated code; I did not state that it would be trivial.

So, what is the solution here? Switch replicated caches to full_sync by default? This seems very simple from a coding standpoint and should work for most deployments.

--Yakov
I agree with Yakov. Let's not swing like a pendulum from one side to the other. Why not switch to FULL_SYNC only for REPLICATED caches?

D.
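Whatever the defaults end up being, a user can already request this behavior per cache. The fragment below is a configuration sketch that assumes ignite-core is on the classpath; the class name `ReplicatedCacheConfigExample` and the cache name "myReplicatedCache" are hypothetical.

```java
import org.apache.ignite.cache.CacheMode;
import org.apache.ignite.cache.CacheWriteSynchronizationMode;
import org.apache.ignite.configuration.CacheConfiguration;

public class ReplicatedCacheConfigExample {
    /** Builds a REPLICATED cache configuration that favors correctness over write latency. */
    static CacheConfiguration<Integer, String> replicatedFullSync() {
        // "myReplicatedCache" is a hypothetical cache name.
        CacheConfiguration<Integer, String> cfg = new CacheConfiguration<>("myReplicatedCache");

        cfg.setCacheMode(CacheMode.REPLICATED);

        // Wait for all copies to be updated before a write is acknowledged.
        cfg.setWriteSynchronizationMode(CacheWriteSynchronizationMode.FULL_SYNC);

        // Optional: item 2 from the original proposal, always read from the primary copy.
        cfg.setReadFromBackup(false);

        return cfg;
    }
}
```

The resulting configuration can be passed to `Ignite.getOrCreateCache(...)` or listed in the node's `IgniteConfiguration`.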
In reply to this post by Vladimir Ozerov
Guys, I hope I can add one more example here.

When we use IgniteAtomicSequence, some assertions can be triggered after topology changes because of the default AtomicConfiguration, i.e.

public static final int DFLT_BACKUPS = 0;
public static final CacheMode DFLT_CACHE_MODE = PARTITIONED;

A minimal improvement here would be to set DFLT_BACKUPS = 1 or to switch to REPLICATED mode.

Thanks.
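Until the defaults change, the same effect can be requested explicitly through `AtomicConfiguration`. This is a configuration sketch that assumes ignite-core is on the classpath; the class name `AtomicConfigExample` and the method name `withSaferAtomics` are hypothetical.

```java
import org.apache.ignite.cache.CacheMode;
import org.apache.ignite.configuration.AtomicConfiguration;
import org.apache.ignite.configuration.IgniteConfiguration;

public class AtomicConfigExample {
    /** Builds a node configuration whose atomic data structures keep a backup copy. */
    static IgniteConfiguration withSaferAtomics() {
        AtomicConfiguration atomicCfg = new AtomicConfiguration();

        // Keep one backup copy instead of the DFLT_BACKUPS = 0 quoted above.
        atomicCfg.setBackups(1);

        // Alternative mentioned above: keep a full copy on every node.
        // atomicCfg.setCacheMode(CacheMode.REPLICATED);

        IgniteConfiguration cfg = new IgniteConfiguration();
        cfg.setAtomicConfiguration(atomicCfg);
        return cfg;
    }
}
```

A node started with this configuration, for example via `Ignition.start(cfg)`, applies these settings to atomic sequences created on it.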