IgniteDataFrame SparkSQL OR clause return incorrect result

alexcwyu
I am using IgniteDataFrame and querying the DataFrame with Spark SQL.
    Spark: 2.3.2
    Ignite: 2.7.0

I found a bug in SparkSQL while using Ignite.

    select count(*) from risk where val_date = '2019-04-26' and portf_id = 27315
    -- correctly returns 11 rows

    select count(*) from risk where val_date = '2019-04-26' and portf_id = 27315 or portf_id = 14041
    -- correctly returns 494 rows

    select count(*) from risk where val_date = '2019-04-26' and (portf_id = 27315 or portf_id = 14041)
    -- expected to return 505 rows, but it returns more than 7000 rows

If I turn off Ignite, the row count for the query with the OR clause is correct.

Is there anything I can do to further debug or pinpoint the issue?



--
Sent from: http://apache-ignite-developers.2346864.n4.nabble.com/

Re: IgniteDataFrame SparkSQL OR clause return incorrect result

Nikolay Izhikov-2
Hello, alex.

Thanks for reporting this.

1. You can file a bug in Ignite Jira.
2. It would be even better if you wrote a simple self-contained reproducer
for your problem.
3. If you turn off Ignite query optimization, is the issue still reproducible?
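
For point 2, one possible starting point is a reference baseline: load a tiny table into a plain SQL engine and record what the three predicates should return, then run the same rows through the Ignite-backed DataFrame and compare. The sketch below is only an illustration under assumptions: it uses Python's built-in sqlite3 module instead of Spark or Ignite, the rows are made up, and the names ROWS, QUERIES and baseline_counts are hypothetical. It relies only on the standard SQL rule that AND binds tighter than OR.

```python
import sqlite3

# Made-up miniature of the `risk` table (hypothetical data, not the real rows).
ROWS = [
    ("2019-04-26", 27315),
    ("2019-04-26", 27315),
    ("2019-04-26", 14041),
    ("2019-04-25", 14041),
    ("2019-04-25", 14041),
    ("2019-04-25", 99999),
]

# The three predicate shapes from the report, labelled for comparison.
QUERIES = {
    "single": (
        "select count(*) from risk "
        "where val_date = '2019-04-26' and portf_id = 27315"
    ),
    "unparenthesized": (
        "select count(*) from risk "
        "where val_date = '2019-04-26' and portf_id = 27315 or portf_id = 14041"
    ),
    "parenthesized": (
        "select count(*) from risk "
        "where val_date = '2019-04-26' and (portf_id = 27315 or portf_id = 14041)"
    ),
}


def baseline_counts(rows):
    """Return the count each query yields in an in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("create table risk (val_date text, portf_id integer)")
        conn.executemany("insert into risk values (?, ?)", rows)
        return {
            name: conn.execute(sql).fetchone()[0]
            for name, sql in QUERIES.items()
        }
    finally:
        conn.close()


if __name__ == "__main__":
    for name, count in baseline_counts(ROWS).items():
        print(f"{name}: {count}")
```

If the Ignite-backed DataFrame disagrees with such a baseline on the same rows only for the parenthesized query, that narrows the problem to how the grouped OR is handled. For point 3, as far as I remember the relevant setting is the "ignite.disableSparkSQLOptimization" option (OPTION_DISABLE_SPARK_SQL_OPTIMIZATION in IgniteDataFrameSettings); please check it against your version.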

Mon, Apr 29, 2019, 19:29 alexcwyu <[hidden email]>:

> I am using IgniteDataFrame and using Spark SQL to query the dataframe.
>     spark: 2.3.2
>     ignite: 2.7.0
>
> I found a bug in SparkSQL while using Ignite.
>
>     select count(*) from risk where val_date = '2019-04-26' and portf_id = 27315
>     -- correctly return 11 row
>
>     select count(*) from risk where val_date = '2019-04-26' and portf_id = 27315 or portf_id = 14041
>     -- correctly return 494 row
>
>     select count(*) from risk where val_date = '2019-04-26' and (portf_id = 27315 or portf_id = 14041)
>     -- expected to return 505 row but it return >7000 row
>
> If I turnoff ignite, the row count with OR clause is correct.
>
> anything I can do to further debug / pinpoint the issue?