New SQL execution engine

Re: New SQL execution engine

Maxim Muzafarov
Folks,
especially Ignite PMCs,

Are there any plans for how Ignite SQL will evolve? This is a very
interesting thread about how Ignite SQL as a product will be developed
in the near future, e.g. which new standards it will support.

According to the documentation, Ignite complies with ANSI SQL-99 [2], but
in fact (correct me if I'm wrong) it doesn't support recursive queries
[1] (the issue mentioned by Andrey), right? Will the new engine be able
to solve this?

[1] https://issues.apache.org/jira/browse/IGNITE-5475
[2] http://ignite.apache.org/use-cases/database/sql-database.html
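
For readers unfamiliar with the feature in question: a recursive query in the SQL:1999 sense is a WITH RECURSIVE common table expression. Below is a minimal sketch of one, using Python's built-in sqlite3 module purely as a stand-in engine. The employee table and its rows are made up for illustration, and nothing here is Ignite code.

```python
import sqlite3

# SQLite is only a stand-in for "an engine that supports recursive CTEs".
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT, manager_id INTEGER)"
)
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [(1, "Ann", None), (2, "Bob", 1), (3, "Cid", 2), (4, "Dee", 1)],
)

# The anchor member selects the root (no manager); the recursive member
# repeatedly joins employees to the rows produced so far.
RECURSIVE_QUERY = """
WITH RECURSIVE reports(id, name, depth) AS (
    SELECT id, name, 0 FROM employee WHERE manager_id IS NULL
    UNION ALL
    SELECT e.id, e.name, r.depth + 1
    FROM employee e JOIN reports r ON e.manager_id = r.id
)
SELECT name, depth FROM reports ORDER BY depth, name
"""

rows = conn.execute(RECURSIVE_QUERY).fetchall()
print(rows)
```

The number of join iterations is not known when the query is planned, which is one reason such queries are hard to express as a fixed, single-pass plan.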

On Fri, 27 Sep 2019 at 17:22, Nikolay Izhikov <[hidden email]> wrote:

>
> Igor.
>
> > The main issue - there is no *selection*.
>
> 1. I don't remember a community decision about this.
>
> 2. We should avoid making such a long-term decision so quickly.
> We made this kind of decision with H2 and have come to the point where we should review it.
>
> > 1) Implementing white papers from scratch
> > 2) Adopting Calcite to our needs.
>
> The third option doesn't fix the issues we have with H2.
> The fourth option I know of is using spark-catalyst.
>
> What is wrong with writing an engine from scratch?
>
> I ask you to start with engine requirements.
> Can we, please, discuss it?
>
> > If you have an alternative - you're welcome, I'll gratefully listen to you.
>
> We have an alternative for now: the H2-based engine.
>
> > The main question isn't "WHAT" but "HOW" - that's the discussion topic from my point of view.
>
> When we make a decision about the engine, we can discuss a roadmap for the replacement.
> One more time: replacing the SQL engine with a more customizable one makes sense to me.
> But this kind of decision needs careful discussion.
>
> On Fri, 27/09/2019 at 17:08 +0300, Seliverstov Igor wrote:
> > Nikolay,
> >
> > The main issue - there is no *selection*.
> >
> > There is a field of knowledge, relational algebra, that describes how to transform relational expressions while preserving their semantics, and there are a couple of implementations (Calcite is the only one written in Java).
> >
> > There are only two alternatives:
> >
> > 1) Implementing white papers from scratch
> > 2) Adopting Calcite to our needs.
> >
> > The second way was chosen by several other projects; there is experience and a list of known issues (like using indexes), so almost everything is already done for us.
> >
> > Implementing a planner is a big deal; I think everybody here understands that. That's why our proposal to reuse others' experience is the obvious one.
> >
> > If you have an alternative - you're welcome, I'll gratefully listen to you.
> >
> > The main question isn't "WHAT" but "HOW" - that's the discussion topic from my point of view.
> >
> > Regards,
> > Igor
> >
> > > On 27 Sep 2019, at 16:37, Nikolay Izhikov <[hidden email]> wrote:
> > >
> > > Roman.
> > >
> > > > Nikolay, Maxim, I understand that our arguments may not be as obvious
> > > > to you as they are to the SQL team. So, please arrange your questions in
> > > > a more constructive way.
> > >
> > > What is the SQL team?
> > > I only know the Ignite community :)
> > >
> > > Please share your knowledge in the IEP.
> > > I want to join the process of engine *selection*.
> > > It should start with the requirements for such an engine.
> > > Can you write them in the IEP, please?
> > >
> > > My point is very simple:
> > >
> > > 1. We made the wrong decision with H2
> > > 2. We should make a well-thought decision about the new engine.
> > >
> > > > How many tickets would satisfy you?
> > >
> > > You write about "issueS" with H2.
> > > All I see is one open ticket.
> > > The IEP doesn't provide enough information.
> > > So it's not about the number of tickets, it's about
> > >
> > > > These two points (single map-reduce execution and inflexible optimizer)
> > > > are the main problems with the current engine.
> > >
> > > We may come to a point where Calcite (or any other engine) brings us a third and further "main problems".
> > > This is what happened with H2.
> > >
> > > Let's start from what we want to get from the engine and move forward from there.
> > > What do you think?
> > >
> > >
> > >
> > > On Fri, 27/09/2019 at 16:15 +0300, Roman Kondakov wrote:
> > > > Maxim, Nikolay,
> > > >
> > > > I've listed two issues which show the ideological flaws of the current
> > > > engine.
> > > >
> > > > 1. IGNITE-11448 - Open. This ticket describes the impossibility of
> > > > executing queries that do not fit into the hardcoded one-pass
> > > > map-reduce paradigm.
> > > >
> > > > 2. IGNITE-6085 - Closed (won't fix). This ticket describes the second
> > > > major problem with the current engine: the H2 query optimizer is very
> > > > primitive and cannot perform many useful optimizations.
> > > >
> > > > These two points (single map-reduce execution and inflexible optimizer)
> > > > are the main problems with the current engine. It means that our engine
> > > > is currently suitable for executing only a very limited subset of
> > > > typical SQL queries. For example, it cannot even run most of the TPC-H
> > > > benchmark queries because they don't fit the simple map-reduce paradigm.
> > > >
> > > > > All I see is links to two tickets:
> > > >
> > > > How many tickets would satisfy you? I named two, and it looks like that is
> > > > not enough from your point of view. OK, so how many is enough? The set
> > > > of problems caused by the tickets listed above is infinite, therefore I
> > > > cannot create a ticket for each of them.
> > > > > Tech details also should be added.
> > > >
> > > > Tech details are in the tickets.
> > > >
> > > > > We can't discuss such a huge change as an execution engine replacement with descriptions like:
> > > > > "No data co-location control, i.e. arbitrary data can be returned silently" or
> > > > > "Low control on how query executes internally, as a result we have limited possibility to implement improvements/fixes."
> > > >
> > > > Why not? Don't you understand these problems? Or don't you think this is
> > > > a problem?
> > > >
> > > > > Let's make these descriptions more specific.
> > > >
> > > > What do you mean by "more specific"? What are the criteria for a
> > > > specific description?
> > > >
> > > >
> > > >
> > > > Nikolay, Maxim, I understand that our arguments may not be as obvious
> > > > to you as they are to the SQL team. So, please arrange your questions in
> > > > a more constructive way.
> > > >
> > > > Thank you!
> >
> >
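
The co-location concern quoted above ("arbitrary data can be returned silently") can be illustrated with a toy model of a single map-reduce pass. This is only a sketch: the two-node layout, the table contents, and the function names are invented for illustration and are not taken from Ignite's H2-based engine.

```python
from itertools import chain

# Two hypothetical nodes, each holding one partition of both tables.
# customers rows are (customer_id, name); orders rows are (order_id, customer_id).
# An order and its customer are NOT guaranteed to live on the same node.
nodes = [
    {"customers": [(1, "Ann")], "orders": [(10, 2), (11, 1)]},
    {"customers": [(2, "Bob")], "orders": [(12, 1), (13, 2)]},
]


def local_join(node):
    """Map step: join only the rows stored on this node."""
    names = dict(node["customers"])
    return [(oid, names[cid]) for oid, cid in node["orders"] if cid in names]


def one_pass(nodes):
    """Single map-reduce pass: local joins, then a plain merge on the reducer."""
    return sorted(chain.from_iterable(local_join(n) for n in nodes))


def with_exchange(nodes):
    """Add a data exchange (here a simple broadcast of customers) before joining."""
    all_customers = list(chain.from_iterable(n["customers"] for n in nodes))
    shuffled = [{"customers": all_customers, "orders": n["orders"]} for n in nodes]
    return one_pass(shuffled)


print(one_pass(nodes))       # rows whose match lives on another node are missing
print(with_exchange(nodes))  # complete join result
```

The single pass raises no error; it simply returns two of the four matching rows. Getting the full result needs an additional exchange step between the scan and the join, which is the kind of multi-step plan a fixed one-pass scheme cannot express.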
Re: New SQL execution engine

Nikolay Izhikov-2
In reply to this post by Andrew Mashenkov
Hello, Andrey.

Thanks, it's clearer now.

> I agree, we should first make the IEP clear to everyone in the community who wants to be involved in its implementation.

Great!
Looking forward to the IEP clarification.


On Fri, 27/09/2019 at 18:07 +0300, Andrey Mashenkov wrote:

> Nikolay, Igor.
>
> Implementing from scratch is an option, of course.
> If we decide to go this way, then we definitely won't want to spend long
> nights inventing "yet another SQL parser" with all the stuff related to
> query rewrite rules (e.g. IN -> JOIN) or type casting \ validation \
> conversion.
>
> We thought about replacing H2 step by step.
> 1. We tried to make a POC that replaced the parser with one generated from
> an SQL grammar with ASM, but this approach looked slow, AFAIR. GridGainers,
> does anybody have something on this?
>
> 2. Then we need a planner with all the rules.
> Of course, we will need to write rules optimized for distributed execution
> anyway, but I doubt anybody wants to write the common rules that Calcite
> already has.
> We can copy-paste, but what for?
>
> 3. Then we have to implement the execution pipeline.
> Possibly, we can adapt the new query plans for H2 execution, but then we
> will still have the same pain of resolving H2 internal issues (e.g. OOM).
> The H2 approach is outdated; it doesn't fit Ignite's needs as a distributed
> system.
>
> With Calcite we can concentrate on points 2 and (mostly) 3 and reuse
> its architectural abstractions; otherwise we would have to reinvent those
> abstractions through long discussions on the dev list.
>
> I agree, we should first make the IEP clear to everyone in the community
> who wants to be involved in its implementation.
> Both approaches ("from scratch" and "with Calcite") are risky, so
>
> can we try to make an additional "beta" engine implementation and allow
> users to fall back to the old engine until the new one is judged mature
> enough?
>
>
>
>
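
As an illustration of the "IN -> JOIN" rewrite rule mentioned above, here is a minimal sketch that checks the equivalence on a tiny data set. It uses Python's built-in sqlite3 module as a stand-in engine; the person and big_city tables are made up, and the snippet only shows what such a rule must preserve, not how Calcite or H2 implement it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE person (id INTEGER PRIMARY KEY, city_id INTEGER);
CREATE TABLE big_city (city_id INTEGER);
INSERT INTO person VALUES (1, 100), (2, 200), (3, 100), (4, 300);
INSERT INTO big_city VALUES (100), (300), (100);  -- note the duplicate 100
""")

# Original form: filter with an IN subquery.
IN_FORM = """
SELECT id FROM person
WHERE city_id IN (SELECT city_id FROM big_city)
ORDER BY id
"""

# Rewritten form: join against the de-duplicated subquery result.
JOIN_FORM = """
SELECT p.id FROM person p
JOIN (SELECT DISTINCT city_id FROM big_city) c ON p.city_id = c.city_id
ORDER BY p.id
"""

# Naive rewrite without DISTINCT: duplicates in big_city multiply the output.
NAIVE_JOIN_FORM = """
SELECT p.id FROM person p
JOIN big_city c ON p.city_id = c.city_id
ORDER BY p.id
"""

in_rows = conn.execute(IN_FORM).fetchall()
join_rows = conn.execute(JOIN_FORM).fetchall()
naive_rows = conn.execute(NAIVE_JOIN_FORM).fetchall()
print(in_rows, join_rows, naive_rows)
```

The naive variant shows why even a "simple" rule needs care: without de-duplication the join changes the result, so the rewrite is only valid when semantics are preserved. A mature rule library already encodes such conditions.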

Re: New SQL execution engine

Nikolay Izhikov-2
In reply to this post by gvvinblade
Igor.

> There is no decision, here we should decide.

Great.

> Right now the Calcite-based engine lives in a separate module

Which project hosts the Calcite-based engine?

> It’s possible to develop it as an experimental extension at first (not a replacement)

For me, Ignite 3 is the place where the new engine has to land.
Personally, I'm against supporting two independent implementations of the SQL engine for several releases.

Ignite has too many partially implemented features to include one more :)

Let's start with the IEP clarification and replace the SQL engine with the best one for the good of Ignite.


On Fri, 27/09/2019 at 18:08 +0300, Seliverstov Igor wrote:

> Nikolay,
>
> At last we have better questions.
>
> There is no decision, here we should decide.
>
> Doing nothing isn’t a decision, it’s just doing nothing.
>
> Spark Catalyst is a good example; under the hood it follows exactly the same idea, but adapted to Spark. Calcite is the same, but general-purpose. That’s why it’s a better starting point.
>
> Implementing an engine from scratch is really cool, but it looks like reinventing the wheel; I don’t think it makes sense. At least, I am against this option.
>
> I added requirements to the IEP (as you asked); as you can see, it’s in DRAFT state and will be complemented with details.
>
> We have some thoughts on how to make the replacement smooth, but first we should decide what to replace and what to replace it with.
>
> Right now the Calcite-based engine lives in a separate module; we checked that it can build an execution graph for both local and distributed cases, and it has good extensibility.
> We talked to the Calcite community to identify possible future issues, and everything points to it being the best option.
> It’s possible to develop it as an experimental extension at first (not a replacement) until we make sure that it works as expected. This way there is no risk for anybody who uses Ignite in a production environment.
>
> Regards,
> Igor
>

Re: New SQL execution engine

gvvinblade
Nikolay,

> Which project hosts the Calcite-based engine?


Currently the prototype lives in my personal Ignite fork. I need an appropriate ticket before pushing it to the ASF git repository.
First, I think, we should discuss the idea in general.

> Personally, I'm against supporting two independent implementations of the SQL engine for several releases.


I don’t like the idea of having two engines either. But even developing the engine on top of the Calcite library is still a big deal.
I'm not sure it will be ready; no, I'm sure it WON'T be ready by the Ignite 3 release. That is why I mentioned the option of having two engines at the same time.

> Let's start with the IEP clarification and replace the SQL engine with the best one for the good of Ignite.

Of course, but in any case it would be good to get familiar with the couple of examples it already describes and to clarify any additional questions the community may ask.

Regards,
Igor
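
To make the "two engines at the same time" option concrete, here is a minimal sketch of an opt-in experimental engine with fallback to the existing one. Every name in it (LegacyEngine, ExperimentalEngine, QueryRouter, UnsupportedQuery, the use_experimental flag) is hypothetical and chosen for illustration; it is not an Ignite API and not a description of the prototype.

```python
class UnsupportedQuery(Exception):
    """Raised by an engine that cannot plan or execute a statement."""


class LegacyEngine:
    """Stands in for the existing, production-proven engine."""
    name = "legacy"

    def execute(self, sql):
        return f"{self.name}: {sql}"


class ExperimentalEngine:
    """Stands in for a prototype that only handles SELECT so far."""
    name = "experimental"

    def execute(self, sql):
        if not sql.lstrip().upper().startswith("SELECT"):
            raise UnsupportedQuery(sql)
        return f"{self.name}: {sql}"


class QueryRouter:
    """Routes to the experimental engine only when explicitly enabled,
    and falls back to the legacy engine when the prototype declines."""

    def __init__(self, legacy, experimental, use_experimental=False):
        self.legacy = legacy
        self.experimental = experimental
        self.use_experimental = use_experimental

    def execute(self, sql):
        if self.use_experimental:
            try:
                return self.experimental.execute(sql)
            except UnsupportedQuery:
                pass  # fall through to the legacy engine
        return self.legacy.execute(sql)


router = QueryRouter(LegacyEngine(), ExperimentalEngine(), use_experimental=True)
print(router.execute("SELECT 1"))
print(router.execute("UPDATE t SET x = 1"))
```

With the flag off by default, existing deployments never touch the new code path, which is the "no risk for production users" property discussed in the thread. The cost is the one Nikolay points out: two implementations to maintain for as long as the fallback exists.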


Re: New SQL execution engine

dmagda
In reply to this post by Igor Seliverstov
Ignite mates, let me try to move the discussion in a constructive direction.
It looks like we set the wrong context from the very beginning.

Before proposing this idea to the community, some of us were
discussing/researching the topic in different groups (one needs to think
things through first before even suggesting changes of this magnitude).
The day has come to share this idea with the whole community
and outline the next actions. But (!) nobody is 100% sure that this is the
right decision. Thus, this will be an *experiment*: some of our community
members will be developing a *prototype*, and only based on the prototype
outcomes shall we make a final decision. Igor, Roman, Ivan, Andrey, I hope
that nothing has changed and we're on the same page here.

Many technical and architectural reasons that justify this project have
been shared, but let me throw in my perspective. There is nothing wrong with
H2; it was the right choice for its time. Thanks to H2 and the Ignite SQL
APIs, our project is used across hundreds of deployments that
accelerate relational databases or use Ignite as a system of record.
However, these days many more companies are migrating to *distributed*
databases that speak SQL. For instance, if a couple of years ago 1 out of
10 use cases needed support for multi-join queries, queries with
subselects, or efficient memory usage, then today it is 5 out of 10;
in the foreseeable future, it will be 10 out of 10.
So, the evolution is in progress: the relational world is going distributed,
and it has become exhausting for both the Ignite SQL maintainers and the
experts who help tune it for production usage to keep pace with that
evolution, mostly due to the H2 dependency. Thus, Ignite SQL has to evolve
and has to be ready to face the future reality.

Luckily, we don't need to rush, and we don't have the right to rush, because
hundreds of existing users have already trusted their production
environments to Ignite SQL and we need to roll out changes with such a big
impact carefully. So, I'm excited that Roman, Igor, Ivan, and Andrey stepped
in and agreed to be the first contributors who will be *experimenting* with
the new SQL engine. Let's support them; let's connect them with the Apache
Calcite community and see how this story evolves. Folks, please keep the
community aware of the progress and let us know when help is needed; some of
us will be ready to support development once you create a solid foundation
for the prototype.

-
Denis


On Fri, Sep 27, 2019 at 1:45 AM Igor Seliverstov <[hidden email]>
wrote:

> Hi Igniters!
>
> As you might know, we currently have many open issues relating to the
> current H2-based engine and its execution flow.
>
> Some of them are critical (like the impossibility of executing particular
> queries), some of them are major (like the impossibility of executing
> particular queries without first arranging your data to be collocated),
> and many are minor.
>
> Most of the issues cannot be solved without a redesign of the whole engine.
>
> So, here is the proposal:
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=130028084
>
> I'd appreciate it if you shared your thoughts on it.
>
> Regards,
> Igor
>

Re: New SQL execution engine

Nikolay Izhikov-2
In reply to this post by gvvinblade
> I think, we should discuss the idea in general.

Everybody likes the idea so far :)
The devil is in the details, as usual.


On Fri, 27/09/2019 at 19:03 +0300, Seliverstov Igor wrote:

> Nikolay,
>
> > Which project hosts the Calcite-based engine?
>
>
> Currently the prototype lives in my personal Ignite fork. I need an appropriate ticket before pushing it to the ASF git repository.
> First, I think, we should discuss the idea in general.
>
> > Personally, I'm against supporting two independent implementations of the SQL engine for several releases.
>
>
> I don’t like the idea of having two engines either. But even developing the engine on top of the Calcite library is still a big deal.
> I’m not sure it will be ready; no, I’m sure it WON’T be ready by the Ignite 3 release. So I mentioned the option of having two engines at the same time.
>
> > Let's start with the IEP clarification and replace the SQL engine with the best one for the good of Ignite.
>
> Of course, but anyway it’s good to get familiar with the couple of examples it already describes and to clarify any additional questions the community may ask.
>
> Regards,
> Igor
>
> > On 27 Sep 2019, at 18:22, Nikolay Izhikov <[hidden email]> wrote:
> >
> > Igor.
> >
> > > There is no decision, here we should decide.
> >
> > Great.
> >
> > > Right now the Calcite-based engine is placed in a separate module
> >
> > Which project hosts the Calcite-based engine?
> >
> > > It’s possible to develop it as an experimental extension at first (not a replacement)
> >
> > For me, Ignite 3 is the place where the new engine has to be placed.
> > Personally, I'm against supporting two independent implementations of the SQL engine for several releases.
> >
> > Ignite has too many partially implemented features to include one more :)
> >
> > Let's start with the IEP clarification and replace the SQL engine with the best one for the good of Ignite.
> >
> >
> > On Fri, 27/09/2019 at 18:08 +0300, Seliverstov Igor wrote:
> > > Nikolay,
> > >
> > > At last we have better questions.
> > >
> > > There is no decision, here we should decide.
> > >
> > > Doing nothing isn’t a decision, it’s just doing nothing
> > >
> > > Spark Catalyst is a good example, but under the hood it has absolutely the same idea, just adapted to Spark. Calcite is the same, but general-purpose. That’s why it’s a better starting point.
> > >
> > > Implementing an engine from scratch is really cool, but it looks like reinventing the wheel; I don’t think it makes sense. At least, I’m against this option.
> > >
> > > I added requirements to the IEP (as you asked); you can see it’s in DRAFT state and will be complemented with details.
> > >
> > > We have some thoughts on how to make the replacement smooth, but first we should decide what to replace and what to replace it with.
> > >
> > > Right now the Calcite-based engine is placed in a separate module; we checked that it can build an execution graph for both local and distributed cases, and it has good extensibility.
> > > We talked to the Calcite community to identify possible future issues, and everything points to it being the best option.
> > > It’s possible to develop it as an experimental extension at first (not a replacement) until we make sure that it works as expected. This way there are no risks for anybody who uses Ignite in a production environment.
> > >
> > > Regards,
> > > Igor


Re: New SQL execution engine

Nikolay Izhikov-2
In reply to this post by dmagda
Hello, Denis.

Thanks for the clarifications.

Sounds good to me.
All I'm trying to say in this thread is:
Guys, please, let's take a step back and write down the requirements (what we want to get from the SQL engine),
and which features and use cases are primary for us.

I'm sure you have already done it during your research.

Please, share it with the community.

I'm pretty sure we will come back to this document again and again during the migration.
So a well-written design is worth it.



Re: New SQL execution engine

Ivan Pavlukhin
Folks,

Thanks, everyone, for the heated discussion! Not every open source community
has such open and lively discussions. It means that people here
really do care. And I am proud of it!

As I understand it, nobody is strictly against the proposed initiative.
And I am glad that we can move forward (with some steps back along the
way).


--
Best regards,
Ivan Pavlukhin

Re: New SQL execution engine

dmagda
Ivan, we need more of these discussions, totally agree with you ;)

I've updated the Motivation paragraph, outlining some high-level use cases
we see while working with our users. Hope it helps. Let's carry on, and let
me send a note to the Apache Calcite community.

-
Denis



Re: New SQL execution engine

Nikolay Izhikov-2
Hello, Igniters.

I extended the IEP [1] with the tickets caused by H2 limitations.

Please, let's write down the requirements for the engine in the IEP.

[1] https://cwiki.apache.org/confluence/display/IGNITE/IEP-33%3A+New+SQL+executor+engine+infrastructure



Re: New SQL execution engine

gvvinblade
Nikolay,

The document you edited is the wrong one (and it is outdated).

Since its author had a different idea in mind, I decided not to change IEP-35
and to create a new one instead: IEP-37 (https://cwiki.apache.org/confluence/x/NBLABw).
It already has a number of key requirements.

Regards,
Igor

On Tue, Oct 1, 2019, 6:14 Nikolay Izhikov <[hidden email]> wrote:

> Hello, Igniters.
>
> I extends IEP [1] with the tickets caused by H2 limitations.
>
> Please, let's write down requirements for engine in the IEP.
>
>
> https://cwiki.apache.org/confluence/display/IGNITE/IEP-33%3A+New+SQL+executor+engine+infrastructure
>
> On Mon, 30/09/2019 at 17:20 -0700, Denis Magda wrote:
> > Ivan, we need more of these discussions, totally agree with you ;)
> >
> > I've updated the Motivation paragraph outlining some high-level users we
> > see by working with our users. Hope it helps. Let's carry on and let me
> > send a note to Apache Calcite community.
> >
> > -
> > Denis
> >
> >
> > On Mon, Sep 30, 2019 at 1:56 AM Ivan Pavlukhin <[hidden email]> wrote:
> >
> > > Folks,
> > >
> > > Thanks everyone for a hot discussion! Not every open source community
> > > has such open and boiling discussions. It means that people here
> > > really do care. And I am proud of it!
> > >
> > > As I understood, nobody is strictly against the proposed initiative.
> > > And I am glad that we can move forward (with some steps back along the
> > > way).
> > >
> > > On Fri, Sep 27, 2019 at 19:29, Nikolay Izhikov <[hidden email]> wrote:
> > > >
> > > > Hello, Denis.
> > > >
> > > > Thanks for the clarifications.
> > > >
> > > > Sounds good for me.
> > > > All I try to say in this thread:
> > > > > Guys, please, let's take a step back and write down requirements(what
> > > > > we want to get with SQL engine).
> > > > > Which features and use-cases are primary for us.
> > > > >
> > > > > I'm sure you have done it, already during your research.
> > > > >
> > > > > Please, share it with the community.
> > > > >
> > > > > I'm pretty sure we would back to this document again and again during
> > > > > migration.
> > > > > So good written design is worth it.
> > > >
> > > > > On Fri, 27/09/2019 at 09:10 -0700, Denis Magda wrote:
> > > > > > Ignite mates, let me try to move the discussion in a constructive
> > > > > > way. It looks like we set a wrong context from the very beginning.
> > > > > >
> > > > > > Before proposing this idea to the community, some of us were
> > > > > > discussing/researching the topic in different groups (the one need
> > > > > > to think it through first before even suggesting to consider changes
> > > > > > of this magnitude). The day has come to share this idea with the
> > > > > > whole community and outline the next actions. But (!) nobody is 100%
> > > > > > sure that that's the right decision. Thus, this will be an
> > > > > > *experiment*, some of our community members will be developing a
> > > > > > *prototype* and only based on the prototype outcomes we shall make a
> > > > > > final decision. Igor, Roman, Ivan, Andrey, hope that nothing has
> > > > > > changed and we're on the same page here.
> > > > > >
> > > > > > Many technical and architectural reasons that justify this project
> > > > > > have been shared but let me throw in my perspective. There is
> > > > > > nothing wrong with H2, that was the right choice for that time.
> > > > > > Thanks to H2 and Ignite SQL APIs, our project is used across
> > > > > > hundreds of deployments who are accelerating relational databases or
> > > > > > use Ignite as a system of records. However, these days many more
> > > > > > companies are migrating to *distributed* databases that speak SQL.
> > > > > > For instance, if a couple of years ago 1 out of 10 use cases needed
> > > > > > support for multi-joins queries or queries with subselects or
> > > > > > efficient memory usage then today there are 5 out of 10 use cases of
> > > > > > this kind; in the foreseeable future, it will be a 10 out of 10. So,
> > > > > > the evolution is in progress -- the relational world goes
> > > > > > distributed, it became exhaustive for both Ignite SQL maintainers
> > > > > > and experts who help to tune it for production usage to keep pace
> > > > > > with the evolution mostly due to the H2-dependency. Thus, Ignite SQL
> > > > > > has to evolve and has to be ready to face the future reality.
> > > > > >
> > > > > > Luckily, we don't need to rush and don't have the right to rush
> > > > > > because hundreds existing users have already trusted their
> > > > > > production environments to Ignite SQL and we need to roll out
> > > > > > changes with such a big impact carefully. So, I'm excited that
> > > > > > Roman, Igor, Ivan, Andrey stepped in and agreed to be the first
> > > > > > contributors who will be *experimenting* with the new SQL engine.
> > > > > > Let's support them; let's connect them with Apache Calcite community
> > > > > > and see how this story evolves.  Folks, please keep the community
> > > > > > aware of the progress, let us know when help is needed, some of us
> > > > > > will be ready to support with development once you create a solid
> > > > > > foundation for the prototype.
> > > > > >
> > > > > > -
> > > > > > Denis
> > > > > >
> > > > > >
> > > > > > On Fri, Sep 27, 2019 at 1:45 AM Igor Seliverstov <[hidden email]>
> > > > > > wrote:
> > > > > >
> > > > > > > Hi Igniters!
> > > > > > >
> > > > > > > As you might know currently we have many open issues relating to
> > > > > > > current H2 based engine and its execution flow.
> > > > > > >
> > > > > > > Some of them are critical (like impossibility to execute
> > > > > > > particular queries), some of them are majors (like impossibility
> > > > > > > to execute particular queries without pre-preparation your data
> > > > > > > to have a collocation) and many minors.
> > > > > > >
> > > > > > > Most of the issues cannot be solved without whole engine
> > > > > > > redesign.
> > > > > > >
> > > > > > > So, here the proposal:
> > > > > > >
> > > > > > > https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=130028084
> > > > > > >
> > > > > > > I'll appreciate if you share your thoughts on top of that.
> > > > > > >
> > > > > > > Regards,
> > > > > > > Igor
> > > > > > >
> > >
> > >
> > >
> > > --
> > > Best regards,
> > > Ivan Pavlukhin
> > >
>

Re: New SQL execution engine

Ivan Pavlukhin
Folks,

I have marked IEP-33 as obsolete. Also, IEP-37, which we are currently
working on, now has a pretty URL:
https://cwiki.apache.org/confluence/display/IGNITE/IEP-37%3A+New+query+execution+engine

On Tue, Oct 1, 2019 at 11:17, Seliverstov Igor <[hidden email]> wrote:

>
> Nikolay,
>
> The document you edited is wrong (and outdated).
>
> Since the author meant another idea, I decided not to change IEP-35 and
> create a new one - IEP-37 (https://cwiki.apache.org/confluence/x/NBLABw).
> It's already have a number of key requirements.
>
> Regards,
> Igor



--
Best regards,
Ivan Pavlukhin

Re: New SQL execution engine

Ivan Pavlukhin
Nikolay,

The guys have updated the IEP [1]. Could you please check it? Is anything
missing that is needed at this stage?

[1] https://cwiki.apache.org/confluence/display/IGNITE/IEP-37%3A+New+query+execution+engine

On Tue, Oct 1, 2019 at 12:19, Ivan Pavlukhin <[hidden email]> wrote:

>
> Folks,
>
> I marked IEP-33 as obsolete. Also now the IEP-37 we currently are
> working with has a pretty URL
> https://cwiki.apache.org/confluence/display/IGNITE/IEP-37%3A+New+query+execution+engine
>
> --
> Best regards,
> Ivan Pavlukhin



--
Best regards,
Ivan Pavlukhin

Re: New SQL execution engine

steve.hostettler@gmail.com
Dear all,

Would it also be possible to have parallel execution of SQL queries on a
single node with that approach?
My understanding is that, for the moment, SQL queries are
single-threaded on a given node if there is no affinity.

Best Regards



--
Sent from: http://apache-ignite-developers.2346864.n4.nabble.com/

Re: New SQL execution engine

Roman Kondakov
Hi Steve,

It is possible to execute queries in parallel even in the current
engine, see the docs here [1]. And of course this feature should also be
available in the new engine, though its architecture may change.

[1]
https://apacheignite.readme.io/v2.0/docs/sql-performance-and-debugging#query-parallelism


--
Kind Regards
Roman Kondakov

On 15.11.2019 12:53, [hidden email] wrote:

> Dear all,
>
> would it be possible to also have then // execution of sql queries on single
> node with that approach?
> My understanding is that, for the moment, the SQL queries a re
> single-threaded for a given node if there is no affinity.
>
> Best Regards
>
>
>
> --
> Sent from: http://apache-ignite-developers.2346864.n4.nabble.com/

RE: New SQL execution engine

Hostettler, Steve
Hi Roman,

Actually, it does not work as I expected. Please see https://github.com/hostettler/igniteParallelQueries
Run mvn clean install and then java -jar target/ignite-parallel-query-1.0.0-SNAPSHOT-jar-with-dependencies.jar

This demonstrates that the query does not return the same result with and without the flag. I understand that this is probably because I did not set an affinity, but it is very counter-intuitive.

Am I missing something?

-----Original Message-----
From: Roman Kondakov <[hidden email]>
Sent: Friday, November 15, 2019 11:46 AM
To: [hidden email]
Subject: Re: New SQL execution engine

Hi Steve,

it is possible to execute queries in parallel even in the current engine, see docs here [1]. And of course this feature should also be available in the new engine, though it's architecture may be changed.

[1]
https://apacheignite.readme.io/v2.0/docs/sql-performance-and-debugging#query-parallelism


--
Kind Regards
Roman Kondakov

On 15.11.2019 12:53, [hidden email] wrote:

> Dear all,
>
> would it be possible to also have then // execution of sql queries on single
> node with that approach?
> My understanding is that, for the moment, the SQL queries a re
> single-threaded for a given node if there is no affinity.
>
> Best Regards
>
>
>
> --
> Sent from: http://apache-ignite-developers.2346864.n4.nabble.com/

RE: New SQL execution engine

steve.hostettler@gmail.com
Actually, I am now wondering whether this is simply a bug that I should
report as such, since the behavior differs with and without
parallelism and there is no warning during execution or in the API.

Any thoughts?



--
Sent from: http://apache-ignite-developers.2346864.n4.nabble.com/

Re: New SQL execution engine

Ivan Pavlukhin
Steve,

Yep, unfortunately query parallelism in its current form is
counter-intuitive. But it was designed that way =( As Roman wrote:
> And of course this feature should also be available in the new engine, though it's architecture may be changed.
The architecture of parallel execution will definitely be
reconsidered. Our current aim is that, in a one-node cluster, a query
returns the same results regardless of parallelism.

On Sat, Nov 16, 2019 at 12:48, [hidden email]
<[hidden email]> wrote:

>
> Actually I am now wondering whether this is not just a bug and that I should
> record it as such. As the behavior is different with and without the
> parallelism and there is no warning during execution or in the api.
>
> Any thought?
>
>
>
> --
> Sent from: http://apache-ignite-developers.2346864.n4.nabble.com/



--
Best regards,
Ivan Pavlukhin

RE: New SQL execution engine

Hostettler, Steve
Ivan,

Thanks, that is good news. I use Ignite as a platform rather than directly to run an in-house application, so these kinds of things make the generic code less generic 😊.

Thanks a lot for the great work.

-----Original Message-----
From: Ivan Pavlukhin <[hidden email]>
Sent: Monday, November 18, 2019 10:13 AM
To: dev <[hidden email]>
Subject: Re: New SQL execution engine

Steve,

Yep, unfortunately query parallelism in current flavor is counter-intuitive. But it was designed so =( As Roman wrote
> And of course this feature should also be available in the new engine, though it's architecture may be changed.
The architecture of parallel execution will be definitely reconsidered. And currently we are targeted to do it so in one node cluster query will return the same results regardless parallelism.

On Sat, Nov 16, 2019 at 12:48, [hidden email]
<[hidden email]> wrote:

>
> Actually I am now wondering whether this is not just a bug and that I
> should record it as such. As the behavior is different with and
> without the parallelism and there is no warning during execution or in the api.
>
> Any thought?
>
>
>
> --
> Sent from: http://apache-ignite-developers.2346864.n4.nabble.com/



--
Best regards,
Ivan Pavlukhin

Re: New SQL execution engine

Roman Kondakov
In reply to this post by steve.hostettler@gmail.com
Hi, Steve

This behavior is actually not a bug, although that is not obvious. I'll try
to explain.

When query parallelism = N is turned on, it means that each cache is
divided into N parts from the SQL point of view. Every SQL query is
executed independently over each particular part, and then results are
merged together during the reducer step.
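
The split-and-reduce idea can be sketched in plain Java with no Ignite
dependency. Everything below is an assumption made for illustration: the
class name, the modulo-based segmentation and the sample rows are
hypothetical, and Ignite's real segmentation and reducer are more involved.
The sketch only shows that a single-table aggregate gives the same answer
for any number of segments.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for a cache split into n SQL segments.
final class SegmentedSumSketch {
    // Splits rows of the form {id, amount} into n segments by id (assumed placement rule).
    static List<List<int[]>> split(List<int[]> rows, int n) {
        List<List<int[]>> segments = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            segments.add(new ArrayList<>());
        }
        for (int[] row : rows) {
            segments.get(Math.floorMod(row[0], n)).add(row);
        }
        return segments;
    }

    // Each segment computes SUM(amount) on its own rows; the "reducer" adds the partial sums.
    static long segmentedSum(List<int[]> rows, int n) {
        long total = 0;
        for (List<int[]> segment : split(rows, n)) {
            long partial = 0;
            for (int[] row : segment) {
                partial += row[1];
            }
            total += partial;
        }
        return total;
    }

    public static void main(String[] args) {
        List<int[]> rows = Arrays.asList(
            new int[]{1, 10}, new int[]{2, 20}, new int[]{3, 30}, new int[]{4, 40});
        System.out.println(segmentedSum(rows, 1)); // prints 100
        System.out.println(segmentedSum(rows, 4)); // prints 100
    }
}
```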

This is absolutely identical to distributed query execution, where
instead of a single node with query parallelism = N, we have N nodes
with query parallelism = 1. The SQL query is executed over each partition of
the data on all nodes and then the results are merged on the reducer.

As we can see, query parallelism is equivalent to the distributed query
execution. When we do joins over distributed tables, we need to think
about the collocation of data [1]. If data is not collocated, we get a
wrong result. This happens silently, which is not good, IMO.
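
A minimal sketch of why collocation matters, again in plain Java rather
than Ignite code. The tables (person, city), the modulo placement and all
names are hypothetical. Each segment joins only the rows it holds and the
reducer adds up the matches, so a person stored in a different segment
from its city is silently dropped from the join.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical model of "person JOIN city ON person.cityId = city.id" over n segments.
final class CollocationSketch {
    // person rows: {personId, cityId}; city rows: {cityId}.
    // Returns the number of joined rows found when every segment joins locally.
    static int segmentedJoinCount(List<int[]> persons, List<int[]> cities, int n, boolean collocated) {
        int total = 0;
        for (int seg = 0; seg < n; seg++) {
            for (int[] p : persons) {
                // Not collocated: a person is placed by its own id.
                // Collocated: a person is placed by the join key (cityId).
                int placementKey = collocated ? p[1] : p[0];
                if (Math.floorMod(placementKey, n) != seg) {
                    continue;
                }
                for (int[] c : cities) {
                    boolean cityInSegment = Math.floorMod(c[0], n) == seg;
                    if (cityInSegment && c[0] == p[1]) {
                        total++;
                    }
                }
            }
        }
        return total;
    }

    public static void main(String[] args) {
        List<int[]> persons = Arrays.asList(
            new int[]{1, 1}, new int[]{2, 1}, new int[]{3, 2}, new int[]{4, 2});
        List<int[]> cities = Arrays.asList(new int[]{1}, new int[]{2});
        System.out.println(segmentedJoinCount(persons, cities, 1, false)); // prints 4
        System.out.println(segmentedJoinCount(persons, cities, 2, false)); // prints 2
        System.out.println(segmentedJoinCount(persons, cities, 2, true));  // prints 4
    }
}
```

With two segments and placement by personId, only 2 of the 4 matching
pairs are found, and nothing signals the loss. Placing persons by cityId
restores all 4.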

I reworked your example a bit in order to impose collocation on the
join key, and now the join returns the correct result [2].

The current approach to configuration and query execution is very
inconvenient and should be completely redesigned in the new engine.

[1] https://apacheignite-sql.readme.io/docs/distributed-joins

[2] https://github.com/hostettler/igniteParallelQueries/pull/1


--
Kind Regards
Roman Kondakov

On 16.11.2019 12:50, [hidden email] wrote:

> Actually I am now wondering whether this is not just a bug and that I should
> record it as such. As the behavior is different with and without the
> parallelism and there is no warning during execution or in the api.
>
> Any thought?
>
>
>
> --
> Sent from: http://apache-ignite-developers.2346864.n4.nabble.com/