[SparkDataFrame] Query Optimization. Prototype

[SparkDataFrame] Query Optimization. Prototype

Nikolay Izhikov
Hello, guys.

I have completed a prototype implementation of query optimization from Spark to Ignite [1].

Please, take a look at PR [2].

However, there are still some issues I would like to clarify with the community:

    * Can I know the scale and precision of a DECIMAL column?
      Example - [3]

    * Ignite has some limitations for a *distributed* join. For example, a join can be executed only on indexed columns. Example - [4].
        * Do we have documentation for all limitations of the distributed join implementation?
        * How can I know whether a concrete, syntactically correct SQL query with a join can be executed by Ignite or not?

[1] https://issues.apache.org/jira/browse/IGNITE-7077
[2] https://github.com/apache/ignite/pull/3397
[3] https://gist.github.com/nizhikov/2d67ed68afd473d7251dab7d79135513
[4] https://gist.github.com/nizhikov/a4389fd78636869dd38c13920b5baf2b
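
For readers without access to gist [4], the indexed-column limitation can be sketched roughly as follows (the table, column, and index names here are hypothetical illustrations, not taken from the gist):

```sql
-- Two partitioned tables; a distributed join between them is expected
-- to work only when the join columns are indexed.
CREATE TABLE person (id INT PRIMARY KEY, city_id INT, name VARCHAR);
CREATE TABLE city   (id INT PRIMARY KEY, name VARCHAR);

-- Hypothetical index that the distributed join below relies on;
-- without it, Ignite may reject or mis-plan the join.
CREATE INDEX person_city_idx ON person (city_id);

SELECT p.name, c.name
FROM person p
JOIN city c ON p.city_id = c.id;
```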

Re: [SparkDataFrame] Query Optimization. Prototype

Valentin Kulichenko
Hi Nikolay,

I will review this sometime next week.

-Val


Re: [SparkDataFrame] Query Optimization. Prototype

Nikolay Izhikov
Hello, Val.

Thank you.
Looking forward to your review.

One more issue with Ignite SQL:

Do we have a description of the CAST function?
I want to find out which types can be converted to which others.

Something like the following table for MS SQL Server [1].

Should I write my questions about Ignite SQL in another thread?

[1] https://docs.microsoft.com/ru-ru/sql/t-sql/functions/cast-and-convert-transact-sql
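
For illustration, a few examples of the H2-style CAST syntax that Ignite delegates to H2 (the exact type-conversion matrix for each pair should be verified against the H2 documentation, which is precisely the open question above):

```sql
SELECT CAST('123' AS INT);         -- string to integer
SELECT CAST(1 AS VARCHAR);         -- number to string
SELECT CAST('2018-01-19' AS DATE); -- string to date
```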



Re: [SparkDataFrame] Query Optimization. Prototype

Denis Magda
I see that CAST is supported by H2 but is not listed among the Ignite system
functions:
https://apacheignite-sql.readme.io/docs/system-functions

Serge, is this an oversight on our documentation side? The H2 system function
scope is much broader in general. Could you check this?

Denis


Re: [SparkDataFrame] Query Optimization. Prototype

Serge Puchnin
CONTENTS DELETED
The author has deleted this message.

Re: [SparkDataFrame] Query Optimization. Prototype

Nikolay Izhikov
Serge, thank you!
Vladimir, guys, can you take a look at the other questions:


    * Can I know the scale and precision of a DECIMAL column?
      Example - [3]

    * Ignite has some limitations for a *distributed* join. For example, a join can be executed only on indexed columns. Example - [4].
        * Do we have documentation for all limitations of the distributed join implementation?
        * How can I know whether a concrete, syntactically correct SQL query with a join can be executed by Ignite or not?




[3] https://gist.github.com/nizhikov/2d67ed68afd473d7251dab7d79135513
[4] https://gist.github.com/nizhikov/a4389fd78636869dd38c13920b5baf2b

On Tue, 23/01/2018 at 01:05 -0700, Serge Puchnin wrote:

> Yes, the CAST function is supported by both Ignite and H2.
>
> I've updated the documentation for the following system functions:
> CASEWHEN, CAST, CONVERT, TABLE
>
> https://apacheignite-sql.readme.io/docs/system-functions
>
> In my opinion, the following functions aren't applicable to Ignite:
> ARRAY_GET, ARRAY_LENGTH, ARRAY_CONTAINS, CSVREAD, CSVWRITE, DATABASE,
> DATABASE_PATH, DISK_SPACE_USED, FILE_READ, FILE_WRITE, LINK_SCHEMA,
> MEMORY_FREE, MEMORY_USED, LOCK_MODE, LOCK_TIMEOUT, READONLY, CURRVAL,
> AUTOCOMMIT, CANCEL_SESSION, IDENTITY, NEXTVAL, ROWNUM, SCHEMA,
> SCOPE_IDENTITY, SESSION_ID, SET, TRANSACTION_ID, TRUNCATE_VALUE, USER,
> H2VERSION
>
> Also, an issue was created to review the current documentation:
> https://issues.apache.org/jira/browse/IGNITE-7496
>
> --
> BR,
> Serge

Re: [SparkDataFrame] Query Optimization. Prototype

dsetrakyan
In reply to this post by Serge Puchnin
I think we need a page listing the unsupported functions with an explanation
of why: either the function does not make sense in Ignite, or it is planned
for a future release.

Sergey, do you think you will be able to do it?

D.


Re: [SparkDataFrame] Query Optimization. Prototype

dmagda
Agree. The unsupported functions should be mentioned on the page that will cover Ignite ANSI-99 compliance. We have the first results available for the CORE features of the specification:
https://ggsystems.atlassian.net/wiki/spaces/GG/pages/45093646/ANSI+SQL+99

That’s on my radar. I’ll take care of this.


Denis



Re: [SparkDataFrame] Query Optimization. Prototype

Vladimir Ozerov
Hi Nikolay,

Could you please clarify your question about scale and precision? Do you
mean whether it is possible to read them from the table metadata? If yes, it is
unfortunately not possible at the moment - we do not store information
about lengths, scales, and precision; only the actual data types are passed to
H2 (e.g. String, BigDecimal, etc.). This will be fixed at some point in the
future, but I do not have any dates at the moment.

Now about joins - Denis, I think you provided the wrong link, to our internal
GridGain docs where we accumulate information about ANSI compatibility,
which we are going to publish on the Ignite wiki when it is ready. In any
case, this is not what Nikolay asked about. The question was about the
limitations of our joins, which have nothing to do with the ANSI standard.
Unfortunately, at this moment we do not have a complete list of all the
restrictions on our joins, because a lot of work is delegated to H2. In
some unsupported scenarios we throw an exception. In other cases we return
incorrect results silently (e.g. if you do not co-locate data and forget to
set the "distributed joins" flag). We have a plan to perform extensive testing
of joins (both co-located and distributed) and list all known limitations.
This will require writing a lot of unit tests to cover various scenarios.
I think we will have this information in a matter of 1-2 months.
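
The silent-wrong-result case can be illustrated with a sketch (the schema below and the exact spelling of the connection flag are assumptions to be checked against the docs):

```sql
-- Hypothetical tables: person is partitioned by its own primary key,
-- not by company_id, so matching rows may live on different nodes.
CREATE TABLE company (id INT PRIMARY KEY, name VARCHAR);
CREATE TABLE person  (id INT PRIMARY KEY, company_id INT, name VARCHAR);

-- Run as a default (collocated) join, this can silently return a partial
-- result set; enabling distributed joins (e.g. distributedJoins=true on
-- the JDBC connection) is required for a correct answer.
SELECT c.name, p.name
FROM company c
JOIN person p ON p.company_id = c.id;
```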

Vladimir.


Re: [SparkDataFrame] Query Optimization. Prototype

Nikolay Izhikov
Hello, Vladimir.

Thank you for your answer.

> Do you mean whether it is possible to read it from table metadata?

Yes, you are right.
I want to read the scale and precision of a DECIMAL column from the table metadata.

> This will be fixed at some point in future, but I do not have any dates at the moment.

Is there a ticket for it? I can't find one via a Jira search.
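
To make the metadata question concrete, here is a hypothetical table; the declared precision and scale below are exactly what cannot currently be read back:

```sql
-- DECIMAL(10, 2): precision 10 and scale 2 at declaration time, but the
-- table metadata currently exposes only the value type (BigDecimal).
CREATE TABLE payment (
    id     INT PRIMARY KEY,
    amount DECIMAL(10, 2)
);
```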

> at this moment we do not have complete list of all restrictions on our joins, because a lot of work is delegated to H2.
> In some unsupported scenarios we throw an exception.
> In other cases we return incorrect results silently (e.g. if you do not co-locate data and forgot to set "distributed joins" flag).

Guys, Val, maybe we should exclude join optimization from IGNITE-7077 while we don't have all the limitations at hand?

> We have a plan to perform excessive testing of joins (both co-located and distributed) and list all known limitations.

Can I help somehow with this activity?



Re: [SparkDataFrame] Query Optimization. Prototype

Vladimir Ozerov
Hi Nikolay,

I am not sure whether a ticket for DECIMAL column metadata exists. If you can't
find one under the "sql" component, please feel free to create it on your own.
As for the testing of joins, I think it makes sense to start working on it
once we finish the ANSI compliance testing, which is already in progress.


Re: [SparkDataFrame] Query Optimization. Prototype

Valentin Kulichenko
Nikolay,

When you're talking about join optimization, what exactly are you referring
to?

Since the other parts of the data frames integration are already merged, I
think it's a good time to resurrect this thread. Does it make sense to review
it right now? Or do you want to make some more changes?

-Val


Re: [SparkDataFrame] Query Optimization. Prototype

dmagda
In reply to this post by Vladimir Ozerov
The testing of the SQL 99 *Core* specification is done, and this page will be released once 2.4 goes live:
https://apacheignite-sql.readme.io/v2.3/docs/sql-conformance

However, the Wikipedia page is already live:
https://en.wikipedia.org/wiki/SQL_compliance


Denis



Re: [SparkDataFrame] Query Optimization. Prototype

Nikolay Izhikov-2
In reply to this post by Vladimir Ozerov
Hello, Vladimir.

I've created a ticket:

https://issues.apache.org/jira/browse/IGNITE-7691

В Пн, 12/02/2018 в 11:20 +0300, Vladimir Ozerov пишет:

> Hi Nikolay,
>
> I am not sure if ticket for DECIMAL column metadata exists. If you haven't find one under "sql" component, please feel free to create it on your own. As far as testing of joins, I think it makes sense to start working on it when we finish ANSI compliance testing which is already in progress.
>
> On Wed, Jan 24, 2018 at 12:27 PM, Nikolay Izhikov <[hidden email]> wrote:
> > Hello, Vladimir.
> >
> > Thank you for an answer.
> >
> > > Do you mean whether it is possible to read it from table metadata?
> >
> > Yes, you are right.
> > I want to read scale and precision of DECIMAL column from table metadata.
> >
> > > This will be fixed at some point in future, but I do not have any dates at the moment.
> >
> > Is there ticket for it? I can't find it via jira search
> >
> > > at this moment we do not have complete list of all restrictions on our joins, because a lot of work is delegated to H2.
> > > In some unsupported scenarios we throw an exception.
> > > In other cases we return incorrect results silently (e.g. if you do not co-locate data and forgot to set "distributed joins" flag).
> >
> > Guys, Val, may be we should exclude join optimization from IGNITE-7077 while we haven't all limitation on the hand?
> >
> > > We have a plan to perform excessive testing of joins (both co-located and distributed) and list all known limitations.
> >
> > Can I help somehow with this activity?
> >
> >
> > В Ср, 24/01/2018 в 12:08 +0300, Vladimir Ozerov пишет:
> > > Hi Nikolay,
> > >
> > > Could you please clarify your question about scale and precision? Do you mean whether it is possible to read it from table metadata? If yes, it is not possible at the moment unfortunately - we do not store information about lengths, scales and precision, only actual data types are passed to H2 (e.g. String, BigDecimal, etc.). This will be fixed at some point in future, but I do not have any dates at the moment.
> > >
> > > Now about joins - Denis, I think you provided the wrong link to our internal GridGain docs, where we accumulate information about ANSI compatibility and which we are going to publish on the Ignite WIKI when it is ready. In any case, this is not what Nikolay asked about. The question was about the limitations of our joins, which have nothing to do with the ANSI standard. Unfortunately, at this moment we do not have a complete list of all restrictions on our joins, because a lot of work is delegated to H2. In some unsupported scenarios we throw an exception. In other cases we return incorrect results silently (e.g. if you do not co-locate data and forget to set the "distributed joins" flag). We have a plan to perform extensive testing of joins (both co-located and distributed) and list all known limitations. This would require writing a lot of unit tests to cover various scenarios. I think we will have this information in a matter of 1-2 months.
> > >
> > > Vladimir.
> > >
> > > On Tue, Jan 23, 2018 at 11:45 PM, Denis Magda <[hidden email]> wrote:
> > > > Agree. The unsupported functions should be mentioned on the page that will cover Ignite ANSI-99 compliance. We have first results available for CORE features of the specification:
> > > > https://ggsystems.atlassian.net/wiki/spaces/GG/pages/45093646/ANSI+SQL+99
> > > >
> > > > That’s on my radar. I’ll take care of this.
> > > >
> > > > —
> > > > Denis
> > > >
> > > > > On Jan 23, 2018, at 10:31 AM, Dmitriy Setrakyan <[hidden email]> wrote:
> > > > >
> > > > > I think we need a page listing the unsupported functions with an explanation
> > > > > of why: either it does not make sense in Ignite, or it is planned for a
> > > > > future release.
> > > > >
> > > > > Sergey, do you think you will be able to do it?
> > > > >
> > > > > D.
> > > > >
> > > > > On Tue, Jan 23, 2018 at 12:05 AM, Serge Puchnin <[hidden email]>
> > > > > wrote:
> > > > >
> > > > >> Yes, the CAST function is supported by both Ignite and H2.
> > > > >>
> > > > >> I've updated the documentation for the following system functions:
> > > > >> CASEWHEN Function, CAST, CONVERT, TABLE
> > > > >>
> > > > >> https://apacheignite-sql.readme.io/docs/system-functions
> > > > >>
> > > > >> And to my mind, the following functions aren't applicable to Ignite:
> > > > >> ARRAY_GET, ARRAY_LENGTH, ARRAY_CONTAINS, CSVREAD, CSVWRITE, DATABASE,
> > > > >> DATABASE_PATH, DISK_SPACE_USED, FILE_READ, FILE_WRITE, LINK_SCHEMA,
> > > > >> MEMORY_FREE, MEMORY_USED, LOCK_MODE, LOCK_TIMEOUT, READONLY, CURRVAL,
> > > > >> AUTOCOMMIT, CANCEL_SESSION, IDENTITY, NEXTVAL, ROWNUM, SCHEMA,
> > > > >> SCOPE_IDENTITY, SESSION_ID, SET, TRANSACTION_ID, TRUNCATE_VALUE, USER,
> > > > >> H2VERSION
> > > > >>
> > > > >> Also, an issue was created to review the current documentation:
> > > > >> https://issues.apache.org/jira/browse/IGNITE-7496
> > > > >>
> > > > >> --
> > > > >> BR,
> > > > >> Serge
> > > > >>
> > > > >>
> > > > >>
> > > > >> --
> > > > >> Sent from: http://apache-ignite-developers.2346864.n4.nabble.com/
> > > > >>
> > > >
> > >
> > >
>
>


Re: [SparkDataFrame] Query Optimization. Prototype

Nikolay Izhikov-2
In reply to this post by Valentin Kulichenko
Hello, Valentin.

> Since other parts of data frames integration are already merged,
> I think it's a good time to resurrect this thread?
> Does it make sense to review it right now?
> Or you want to make some more changes?

I've already merged PR [1] with current master.
So you can review it, if you wish.

But I need a couple of days to double-check all the changes,
extend the scaladoc, etc.

I will write to you when the PR is fully ready.

[1] https://github.com/apache/ignite/pull/3397



On Mon, 12/02/2018 at 13:45 -0800, Valentin Kulichenko wrote:

> Nikolay,
>
> When you're talking about join optimization, what exactly are you referring to?
>
> Since other parts of data frames integration are already merged, I think it's a good time to resurrect this thread? Does it make sense to review it right now? Or you want to make some more changes?
>
> -Val


Re: [SparkDataFrame] Query Optimization. Prototype

Valentin Kulichenko
Sounds good. Let me know when you feel it's ready and I'll take a look.

-Val


Re: [SparkDataFrame] Query Optimization. Prototype

Nikolay Izhikov-2
In reply to this post by Valentin Kulichenko
Hello, Valentin.

> When you're talking about join optimization, what exactly are you referring to?

I'm referring to my PR [1].
Currently, it contains a transformation from Spark joins to Ignite joins [2].

But, if I understand Vladimir's answer right, for now we don't *fully* support SQL join queries.

Sometimes it will work just fine; in other cases, it will throw an exception due to Ignite's internal implementation.

Please see my example [3].
The query on line 4 will throw an exception.
The same query on line 10 will succeed because of index creation.

Both of them are syntactically correct.

> Unfortunately, at this moment we do not have a complete list of all restrictions on our joins, because a lot of work is delegated to H2.
> In some unsupported scenarios we throw an exception.
> In other cases we return incorrect results silently (e.g. if you do not co-locate data and forget to set the "distributed joins" flag).
> We have a plan to perform extensive testing of joins (both co-located and distributed) and list all known limitations.
> This would require writing a lot of unit tests to cover various scenarios.
> I think we will have this information in a matter of 1-2 months.

[1] https://github.com/apache/ignite/pull/3397
[2] https://github.com/apache/ignite/pull/3397/files#diff-5a861613530bbce650efa50d553a0e92R227
[3] https://gist.github.com/nizhikov/a4389fd78636869dd38c13920b5baf2b
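For readers without access to the gist, a minimal SQL sketch of the two cases described above (table and column names are illustrative, chosen to match the exception discussed later in this thread, not copied from the gist):

```sql
-- Case 1: distributed join on non-indexed columns.
-- Ignite rejects this with "Failed to prepare distributed join query:
-- join condition does not use index".
SELECT t1.id, t2.id
FROM jt1 t1
JOIN jt2 t2 ON t1.val1 = t2.val2;

-- Case 2: index the joined column first, and the same query succeeds.
CREATE INDEX jt2_val2_idx ON jt2 (val2);

SELECT t1.id, t2.id
FROM jt1 t1
JOIN jt2 t2 ON t1.val1 = t2.val2;
```

Both statements are syntactically valid; only the presence of the index decides whether Ignite can plan the distributed join.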


Re: [SparkDataFrame] Query Optimization. Prototype

Valentin Kulichenko
Nikolay,

This doesn't make sense to me. Not having an index should not cause a query
to fail. What is the exception?

-Val


Re: [SparkDataFrame] Query Optimization. Prototype

Nikolay Izhikov-2
Val,

The check is here in the source code: https://github.com/apache/ignite/blob/master/modules/indexing/src/main/java/org/apache/ignite/internal/processors/query/h2/opt/GridH2CollocationModel.java#L382

Stack trace:

javax.cache.CacheException: Failed to prepare distributed join query: join condition does not use index [joinedCache=SQL_PUBLIC_JT2, plan=SELECT
    __Z0.ID AS __C0_0,
    __Z0.VAL1 AS __C0_1,
    __Z1.ID AS __C0_2,
    __Z1.VAL2 AS __C0_3
FROM PUBLIC.JT1 __Z0
    /* PUBLIC.JT1.__SCAN_ */
INNER JOIN PUBLIC.JT2 __Z1
    /* batched:broadcast PUBLIC.JT2.__SCAN_ */
    ON 1=1
WHERE __Z0.VAL1 = __Z1.VAL2]
        at org.apache.ignite.internal.processors.query.h2.opt.GridH2CollocationModel.joinedWithCollocated(GridH2CollocationModel.java:384)
        at org.apache.ignite.internal.processors.query.h2.opt.GridH2CollocationModel.calculate(GridH2CollocationModel.java:308)
        at org.apache.ignite.internal.processors.query.h2.opt.GridH2CollocationModel.type(GridH2CollocationModel.java:549)
        at org.apache.ignite.internal.processors.query.h2.opt.GridH2CollocationModel.calculate(GridH2CollocationModel.java:257)
        at org.apache.ignite.internal.processors.query.h2.opt.GridH2CollocationModel.type(GridH2CollocationModel.java:549)
        at org.apache.ignite.internal.processors.query.h2.opt.GridH2CollocationModel.isCollocated(GridH2CollocationModel.java:691)
        at org.apache.ignite.internal.processors.query.h2.sql.GridSqlQuerySplitter.split(GridSqlQuerySplitter.java:239)
        at org.apache.ignite.internal.processors.query.h2.IgniteH2Indexing.split(IgniteH2Indexing.java:1856)
        at org.apache.ignite.internal.processors.query.h2.IgniteH2Indexing.parseAndSplit(IgniteH2Indexing.java:1818)
        at org.apache.ignite.internal.processors.query.h2.IgniteH2Indexing.querySqlFields(IgniteH2Indexing.java:1569)
        at org.apache.ignite.internal.processors.query.GridQueryProcessor$4.applyx(GridQueryProcessor.java:2037)
        at org.apache.ignite.internal.processors.query.GridQueryProcessor$4.applyx(GridQueryProcessor.java:2032)
        at org.apache.ignite.internal.util.lang.IgniteOutClosureX.apply(IgniteOutClosureX.java:36)
        at org.apache.ignite.internal.processors.query.GridQueryProcessor.executeQuery(GridQueryProcessor.java:2553)
        at org.apache.ignite.internal.processors.query.GridQueryProcessor.querySqlFields(GridQueryProcessor.java:2046)
        at org.apache.ignite.internal.processors.cache.IgniteCacheProxyImpl.query(IgniteCacheProxyImpl.java:664)
        at org.apache.ignite.internal.processors.cache.IgniteCacheProxyImpl.query(IgniteCacheProxyImpl.java:615)
        at org.apache.ignite.internal.processors.cache.GatewayProtectedCacheProxy.query(GatewayProtectedCacheProxy.java:382)
        at org.apache.ignite.spark.JoinTestSpec.execSQL(JoinTestSpec.scala:63)
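For readers reproducing this: the query plan in the stack trace contains everything needed. A minimal SQL sketch, run with distributed joins enabled — the table and join-column names (JT1, JT2, VAL1, VAL2) are taken from the plan above, while the column types and index names are my guesses:

```sql
CREATE TABLE jt1 (id BIGINT PRIMARY KEY, val1 VARCHAR);
CREATE TABLE jt2 (id BIGINT PRIMARY KEY, val2 VARCHAR);

-- With distributed joins enabled this is rejected with
-- "join condition does not use index": neither VAL1 nor VAL2 is indexed.
SELECT jt1.id, jt1.val1, jt2.id, jt2.val2
FROM jt1
JOIN jt2 ON jt1.val1 = jt2.val2;

-- Indexing the join columns makes the same query pass the check
-- (hypothetical index names):
CREATE INDEX jt1_val1_idx ON jt1 (val1);
CREATE INDEX jt2_val2_idx ON jt2 (val2);

SELECT jt1.id, jt1.val1, jt2.id, jt2.val2
FROM jt1
JOIN jt2 ON jt1.val1 = jt2.val2;
```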
       

On Tue, 13/02/2018 at 08:12 -0800, Valentin Kulichenko wrote:

> Nikolay,
>
> This doesn't make sense to me. Not having an index should not cause a query to fail. What is the exception?
>
> -Val
>
> On Tue, Feb 13, 2018 at 8:07 AM, Nikolay Izhikov <[hidden email]> wrote:
> > Hello, Valentin.
> >
> > > When you're talking about join optimization, what exactly are you referring to?
> >
> > I'm referring to my PR [1]
> > Currently, it contains transformation from Spark joins to Ignite joins [2]
> >
> > But, if I understand Vladimir's answer right, for now we don't *fully* support SQL join queries.
> >
> > Sometimes it will work just fine; in other cases, it will throw an exception due to Ignite's internal implementation.
> >
> > Please, see my example [3].
> > Query from line 4 will throw an exception.
> > The same query from line 10 will succeed, because of index creation.
> >
> > Both of them are syntactically correct.
> >
> > > Unfortunately, at this moment we do not have complete list of all restrictions on our joins, because a lot of work is delegated to H2.
> > > In some unsupported scenarios we throw an exception.
> > > In other cases we return incorrect results silently (e.g. if you do not co-locate data and forgot to set "distributed joins" flag).
> > > We have a plan to perform excessive testing of joins (both co-located and distributed) and list all known limitations.
> > > This would require writing a lot of unit tests to cover various scenarios.
> > > I think we will have this information in a matter of 1-2 months.
> >
> > [1] https://github.com/apache/ignite/pull/3397
> > [2] https://github.com/apache/ignite/pull/3397/files#diff-5a861613530bbce650efa50d553a0e92R227
> > [3] https://gist.github.com/nizhikov/a4389fd78636869dd38c13920b5baf2b
> >
> > On Mon, 12/02/2018 at 13:45 -0800, Valentin Kulichenko wrote:
> > > Nikolay,
> > >
> > > When you're talking about join optimization, what exactly are you referring to?
> > >
> > > Since other parts of the data frames integration are already merged, I think it's a good time to resurrect this thread. Does it make sense to review it right now? Or do you want to make some more changes?
> > >
> > > -Val
> > >
> > > On Mon, Feb 12, 2018 at 12:20 AM, Vladimir Ozerov <[hidden email]> wrote:
> > > > Hi Nikolay,
> > > >
> > > > I am not sure if a ticket for DECIMAL column metadata exists. If you haven't found one under the "sql" component, please feel free to create it on your own. As for testing of joins, I think it makes sense to start working on it when we finish ANSI compliance testing, which is already in progress.
> > > >
> > > > On Wed, Jan 24, 2018 at 12:27 PM, Nikolay Izhikov <[hidden email]> wrote:
> > > > > Hello, Vladimir.
> > > > >
> > > > > Thank you for the answer.
> > > > >
> > > > > > Do you mean whether it is possible to read it from table metadata?
> > > > >
> > > > > Yes, you are right.
> > > > > I want to read the scale and precision of a DECIMAL column from table metadata.
> > > > >
> > > > > > This will be fixed at some point in future, but I do not have any dates at the moment.
> > > > >
> > > > > Is there a ticket for it? I can't find it via JIRA search.
> > > > >
> > > > > > at this moment we do not have complete list of all restrictions on our joins, because a lot of work is delegated to H2.
> > > > > > In some unsupported scenarios we throw an exception.
> > > > > > In other cases we return incorrect results silently (e.g. if you do not co-locate data and forgot to set "distributed joins" flag).
> > > > >
> > > > > Guys, Val, maybe we should exclude join optimization from IGNITE-7077 while we don't have all the limitations at hand?
> > > > >
> > > > > > We have a plan to perform excessive testing of joins (both co-located and distributed) and list all known limitations.
> > > > >
> > > > > Can I help somehow with this activity?
> > > > >
> > > > >
> > > > > On Wed, 24/01/2018 at 12:08 +0300, Vladimir Ozerov wrote:
> > > > > > Hi Nikolay,
> > > > > >
> > > > > > Could you please clarify your question about scale and precision? Do you mean whether it is possible to read it from table metadata? If yes, it is not possible at the moment unfortunately - we do not store information about lengths, scales and precision, only actual data types are passed to H2 (e.g. String, BigDecimal, etc.). This will be fixed at some point in future, but I do not have any dates at the moment.
> > > > > >
> > > > > > Now about joins - Denis, I think you provided a wrong link to our internal GridGain docs where we accumulate information about ANSI compatibility and which we are going to publish on the Ignite WIKI when it is ready. In any case, this is not what Nikolay asked about. The question was about limitations of our joins, which have nothing to do with the ANSI standard. Unfortunately, at this moment we do not have complete list of all restrictions on our joins, because a lot of work is delegated to H2. In some unsupported scenarios we throw an exception. In other cases we return incorrect results silently (e.g. if you do not co-locate data and forgot to set "distributed joins" flag). We have a plan to perform excessive testing of joins (both co-located and distributed) and list all known limitations. This would require writing a lot of unit tests to cover various scenarios. I think we will have this information in a matter of 1-2 months.
> > > > > >
> > > > > > Vladimir.
> > > > > >
> > > > > > On Tue, Jan 23, 2018 at 11:45 PM, Denis Magda <[hidden email]> wrote:
> > > > > > > Agree. The unsupported functions should be mentioned on the page that will cover Ignite ANSI-99 compliance. We have first results available for CORE features of the specification:
> > > > > > > https://ggsystems.atlassian.net/wiki/spaces/GG/pages/45093646/ANSI+SQL+99
> > > > > > >
> > > > > > > That’s on my radar. I’ll take care of this.
> > > > > > >
> > > > > > > —
> > > > > > > Denis
> > > > > > >
> > > > > > > > On Jan 23, 2018, at 10:31 AM, Dmitriy Setrakyan <[hidden email]> wrote:
> > > > > > > >
> > > > > > > > I think we need a page listing the unsupported functions with an explanation of why: either it does not make sense in Ignite, or it is planned for a future release.
> > > > > > > >
> > > > > > > > Sergey, do you think you will be able to do it?
> > > > > > > >
> > > > > > > > D.
> > > > > > > >
> > > > > > > > On Tue, Jan 23, 2018 at 12:05 AM, Serge Puchnin <[hidden email]>
> > > > > > > > wrote:
> > > > > > > >
> > > > > > > >> yes, the Cust function is supported by both Ignite and H2.
> > > > > > > >>
> > > > > > > >> I've updated the documentation for the following system functions:
> > > > > > > >> CASEWHEN Function, CAST, CONVERT, TABLE
> > > > > > > >>
> > > > > > > >> https://apacheignite-sql.readme.io/docs/system-functions
> > > > > > > >>
> > > > > > > >> And to my mind, the following functions aren't applicable to Ignite:
> > > > > > > >> ARRAY_GET, ARRAY_LENGTH, ARRAY_CONTAINS, CSVREAD, CSVWRITE, DATABASE,
> > > > > > > >> DATABASE_PATH, DISK_SPACE_USED, FILE_READ, FILE_WRITE, LINK_SCHEMA,
> > > > > > > >> MEMORY_FREE, MEMORY_USED, LOCK_MODE, LOCK_TIMEOUT, READONLY, CURRVAL,
> > > > > > > >> AUTOCOMMIT, CANCEL_SESSION, IDENTITY, NEXTVAL, ROWNUM, SCHEMA,
> > > > > > > >> SCOPE_IDENTITY, SESSION_ID, SET, TRANSACTION_ID, TRUNCATE_VALUE, USER,
> > > > > > > >> H2VERSION
> > > > > > > >>
> > > > > > > >> Also an issue was created to review the current documentation:
> > > > > > > >> https://issues.apache.org/jira/browse/IGNITE-7496
> > > > > > > >>
> > > > > > > >> --
> > > > > > > >> BR,
> > > > > > > >> Serge
> > > > > > > >>
> > > > > > > >>
> > > > > > > >>
> > > > > > > >> --
> > > > > > > >> Sent from: http://apache-ignite-developers.2346864.n4.nabble.com/
> > > > > > > >>
> > > > > > >
> > > > > >
> > > > > >
> > >
> > >
>
>

Reply | Threaded
Open this post in threaded view
|

Re: [SparkDataFrame] Query Optimization. Prototype

Valentin Kulichenko
Nikolay,

Looks like this is because you enabled non-collocated joins. I was not
aware of this limitation, though. Do we have this documented somewhere?

-Val
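One note for anyone reproducing this: the index check only fires when distributed joins are enabled, which matches the behaviors described earlier in the thread. A hedged sketch — the flag spellings are the stock Ignite ones, but treat the exact semantics as this thread's claims rather than documentation:

```sql
-- Distributed joins OFF (the default): the non-indexed join below is
-- accepted, but on partitioned, non-collocated data each node joins only
-- its local rows, so results can silently be incomplete.
--
-- Distributed joins ON (SqlFieldsQuery.setDistributedJoins(true) in Java,
-- or distributedJoins=true in a thin-driver JDBC URL): Ignite requires an
-- index on the join columns and otherwise fails with
-- "join condition does not use index".
SELECT jt1.id, jt2.id
FROM jt1
JOIN jt2 ON jt1.val1 = jt2.val2;
```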
