Sergi,

Presently, the DML statement doc mentions the following limitation related to SELECT subqueries:
http://apacheignite.gridgain.org/docs/dml#section-subqueries-in-where-clause

Does it mean that this limitation will no longer apply to DML statements either once the improvement below gets released?
https://issues.apache.org/jira/browse/IGNITE-3860

--
Denis

1. Actually, this doc looks misleading and wrong to me. What does "it will not be distributed" mean? It only means that we will not execute any distributed joins inside this subquery. In all other respects the operation will be correctly distributed.

We will generate a query like this:

SELECT _key, _val FROM Person WHERE _key IN (subquery)

and execute it on all the nodes. If the subquery is correctly colocated with Person, everything will work fine.

2. This is not something we are going to change, because the subquery will be executed for each row in Person. If it has a distributed join in it, the operation will never return and most probably will kill the cluster by message flooding.

Sergi

2016-12-23 3:05 GMT+03:00 Denis Magda <[hidden email]>:

> Sergi,
>
> Presently in the DML statement doc we mention the following limitation
> related to SELECT subqueries
> http://apacheignite.gridgain.org/docs/dml#section-subqueries-in-where-clause
>
> Does it mean that this limitation will be no longer true for DML
> statements as well once the improvement below gets released?
> https://issues.apache.org/jira/browse/IGNITE-3860
>
> --
> Denis

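
The colocation point above can be illustrated with a toy model. This is a minimal sketch in plain Python, not Ignite code: the two-node layout, the City table, the cityId field and the function names are all assumptions made up for illustration. Each "node" evaluates the subquery against its local rows only, which is roughly what running without distributed joins inside the subquery amounts to.

```python
# Toy model (hypothetical, not Ignite API): every node holds a slice of
# Person (key -> cityId) and a slice of City (id -> country).

def run_locally(persons, cities):
    """Evaluate, against local rows only, the equivalent of
    SELECT _key FROM Person
    WHERE cityId IN (SELECT id FROM City WHERE country = 'ES')."""
    wanted = {city_id for city_id, country in cities.items() if country == "ES"}
    return {key for key, city_id in persons.items() if city_id in wanted}

def run_on_all_nodes(nodes):
    """Run the same query on every node and merge the partial results."""
    result = set()
    for persons, cities in nodes:
        result |= run_locally(persons, cities)
    return result

# Colocated: every Person row lives on the node that also stores its City row.
colocated = [
    ({1: 10, 2: 10}, {10: "ES"}),
    ({3: 20}, {20: "FR"}),
]

# Not colocated: Person 1 refers to City 10, which is stored on the other node.
scattered = [
    ({1: 10, 3: 20}, {20: "FR"}),
    ({2: 10}, {10: "ES"}),
]

print(sorted(run_on_all_nodes(colocated)))  # [1, 2]
print(sorted(run_on_all_nodes(scattered)))  # [2]
```

In this model the colocated layout finds both matching keys, while the scattered layout silently misses key 1, because the node holding that Person row never sees the matching City row.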
Sergi,

Thanks for the feedback. I've updated the doc. Please have a look at it one more time and make additional corrections if they help to make things clearer:
https://apacheignite.readme.io/v1.8/docs/dml#known-limitations

Regarding the ticket you are working on [1], could you list the cases when the subqueries will be distributed further while the main query is executed on the node? In the ticket you briefly mention group by, aggregation and unions. It would be great to document this info with examples.

[1] https://issues.apache.org/jira/browse/IGNITE-3860

--
Denis

> On Dec 22, 2016, at 10:31 PM, Sergi Vladykin <[hidden email]> wrote:
>
> 1. Actually this doc looks misleading and wrong to me. What does it mean
> "it will not be distributed"? It only means that we will not execute any
> distributed joins inside of this subquery. In all other means this
> operation will be correctly distributed.
>
> We will generate query like this :
>
> SELECT _key, _val FROM Person WHERE _key IN (subquery)
>
> And execute it on all the nodes and if the subquery is correctly colocated
> with Person everything will work fine.
>
> 2. This is not something we are going to change because subquery will be
> executed for each row in Person. If it will have a distributed join in it,
> this operation will never return and most probably will kill the cluster
> by message flooding.
>
> Sergi