Igniters,
Please take a look at how the affinity syntax works in Google Spanner: https://stackoverflow.com/questions/46903159/can-i-have- multiple-tables-with-same-parent-in-google-spanner I find it rather nice. Perhaps we can borrow from it. D. |
Dima,
Yes, I saw it also. But this is not about syntax only. Spanner use this information to store data efficiently - child entries a located near to their parents. We can think of it as if all related tables were logical caches inside one physical cache, sorted by the key. With this storage format it will be possible to implement very efficient co-located joins. On Wed, Oct 25, 2017 at 1:26 PM, Dmitriy Setrakyan <[hidden email]> wrote: > Igniters, > > Please take a look at how the affinity syntax works in Google Spanner: > https://stackoverflow.com/questions/46903159/can-i-have- > multiple-tables-with-same-parent-in-google-spanner > > I find it rather nice. Perhaps we can borrow from it. > > D. > |
On Wed, Oct 25, 2017 at 3:32 AM, Vladimir Ozerov <[hidden email]>
wrote: > Dima, > > Yes, I saw it also. But this is not about syntax only. Spanner use this > information to store data efficiently - child entries a located near to > their parents. We can think of it as if all related tables were logical > caches inside one physical cache, sorted by the key. With this storage > format it will be possible to implement very efficient co-located joins. > Hm... I don't think Ignite's approach for collocated joins is lees efficient. However, back to Spanner, the first value in the child table key is the parent table key. This tells me that Spanner collocates based on an approach very similar to Ignite's affinity key. Am I wrong? |
In Spanner once parent key is found you don't need to search for child keys
from scratch - they are located just after the parent key in the tree. In Ignite child and parent keys are located in different trees, hence more lookups are needed. On Wed, Oct 25, 2017 at 1:36 PM, Dmitriy Setrakyan <[hidden email]> wrote: > On Wed, Oct 25, 2017 at 3:32 AM, Vladimir Ozerov <[hidden email]> > wrote: > > > Dima, > > > > Yes, I saw it also. But this is not about syntax only. Spanner use this > > information to store data efficiently - child entries a located near to > > their parents. We can think of it as if all related tables were logical > > caches inside one physical cache, sorted by the key. With this storage > > format it will be possible to implement very efficient co-located joins. > > > > Hm... I don't think Ignite's approach for collocated joins is lees > efficient. However, back to Spanner, the first value in the child table key > is the parent table key. This tells me that Spanner collocates based on an > approach very similar to Ignite's affinity key. Am I wrong? > |
For example, currently every table in Ignite has at least two PK indexes -
one for cache operations, and another one for H2. If you have two tables (parent - child), you have either 4 indexes (if they are in different groups), or 3 indexes (same logical group). But even if certain tree is shared between caches, dependent data entries are located in completely different parts. With Spanner we will need only 1 index for both tables. On Wed, Oct 25, 2017 at 1:39 PM, Vladimir Ozerov <[hidden email]> wrote: > In Spanner once parent key is found you don't need to search for child > keys from scratch - they are located just after the parent key in the tree. > In Ignite child and parent keys are located in different trees, hence more > lookups are needed. > > On Wed, Oct 25, 2017 at 1:36 PM, Dmitriy Setrakyan <[hidden email]> > wrote: > >> On Wed, Oct 25, 2017 at 3:32 AM, Vladimir Ozerov <[hidden email]> >> wrote: >> >> > Dima, >> > >> > Yes, I saw it also. But this is not about syntax only. Spanner use this >> > information to store data efficiently - child entries a located near to >> > their parents. We can think of it as if all related tables were logical >> > caches inside one physical cache, sorted by the key. With this storage >> > format it will be possible to implement very efficient co-located joins. >> > >> >> Hm... I don't think Ignite's approach for collocated joins is lees >> efficient. However, back to Spanner, the first value in the child table >> key >> is the parent table key. This tells me that Spanner collocates based on an >> approach very similar to Ignite's affinity key. Am I wrong? >> > > |
Thanks, Vladimir, got it. However, even though we may have different
indexes, the data should be on the same node, if it is properly collocated. I do agree that we should try to remove extra index lookups, if possible. Do we have a ticket for it? Is it a lot of work? D. On Wed, Oct 25, 2017 at 3:42 AM, Vladimir Ozerov <[hidden email]> wrote: > For example, currently every table in Ignite has at least two PK indexes - > one for cache operations, and another one for H2. If you have two tables > (parent - child), you have either 4 indexes (if they are in different > groups), or 3 indexes (same logical group). But even if certain tree is > shared between caches, dependent data entries are located in completely > different parts. > > With Spanner we will need only 1 index for both tables. > > On Wed, Oct 25, 2017 at 1:39 PM, Vladimir Ozerov <[hidden email]> > wrote: > > > In Spanner once parent key is found you don't need to search for child > > keys from scratch - they are located just after the parent key in the > tree. > > In Ignite child and parent keys are located in different trees, hence > more > > lookups are needed. > > > > On Wed, Oct 25, 2017 at 1:36 PM, Dmitriy Setrakyan < > [hidden email]> > > wrote: > > > >> On Wed, Oct 25, 2017 at 3:32 AM, Vladimir Ozerov <[hidden email]> > >> wrote: > >> > >> > Dima, > >> > > >> > Yes, I saw it also. But this is not about syntax only. Spanner use > this > >> > information to store data efficiently - child entries a located near > to > >> > their parents. We can think of it as if all related tables were > logical > >> > caches inside one physical cache, sorted by the key. With this storage > >> > format it will be possible to implement very efficient co-located > joins. > >> > > >> > >> Hm... I don't think Ignite's approach for collocated joins is lees > >> efficient. However, back to Spanner, the first value in the child table > >> key > >> is the parent table key. This tells me that Spanner collocates based on > an > >> approach very similar to Ignite's affinity key. Am I wrong? > >> > > > > > |
Dima,
I filed a ticket [1]. This is very promising approach which will allow us to replace regular joins with nested tables in many cases, thus boosting performance of JOINs and making Ignite's affinity configuration easier form UX perspecitve. But this is a big thing, as it would require fully-fledged SQL parser (we cannot process SELECT statements at the moment) and quire a few changes to our B+Tree. [1] https://issues.apache.org/jira/browse/IGNITE-7038 On Tue, Oct 31, 2017 at 6:28 AM, Dmitriy Setrakyan <[hidden email]> wrote: > Thanks, Vladimir, got it. However, even though we may have different > indexes, the data should be on the same node, if it is properly collocated. > I do agree that we should try to remove extra index lookups, if possible. > Do we have a ticket for it? Is it a lot of work? > > D. > > On Wed, Oct 25, 2017 at 3:42 AM, Vladimir Ozerov <[hidden email]> > wrote: > > > For example, currently every table in Ignite has at least two PK indexes > - > > one for cache operations, and another one for H2. If you have two tables > > (parent - child), you have either 4 indexes (if they are in different > > groups), or 3 indexes (same logical group). But even if certain tree is > > shared between caches, dependent data entries are located in completely > > different parts. > > > > With Spanner we will need only 1 index for both tables. > > > > On Wed, Oct 25, 2017 at 1:39 PM, Vladimir Ozerov <[hidden email]> > > wrote: > > > > > In Spanner once parent key is found you don't need to search for child > > > keys from scratch - they are located just after the parent key in the > > tree. > > > In Ignite child and parent keys are located in different trees, hence > > more > > > lookups are needed. > > > > > > On Wed, Oct 25, 2017 at 1:36 PM, Dmitriy Setrakyan < > > [hidden email]> > > > wrote: > > > > > >> On Wed, Oct 25, 2017 at 3:32 AM, Vladimir Ozerov < > [hidden email]> > > >> wrote: > > >> > > >> > Dima, > > >> > > > >> > Yes, I saw it also. But this is not about syntax only. Spanner use > > this > > >> > information to store data efficiently - child entries a located near > > to > > >> > their parents. We can think of it as if all related tables were > > logical > > >> > caches inside one physical cache, sorted by the key. With this > storage > > >> > format it will be possible to implement very efficient co-located > > joins. > > >> > > > >> > > >> Hm... I don't think Ignite's approach for collocated joins is lees > > >> efficient. However, back to Spanner, the first value in the child > table > > >> key > > >> is the parent table key. This tells me that Spanner collocates based > on > > an > > >> approach very similar to Ignite's affinity key. Am I wrong? > > >> > > > > > > > > > |
Free forum by Nabble | Edit this page |