Allow distributed SQL query execution over explicit set of partitions

Re: Allow distributed SQL query execution over explicit set of partitions

Alexei Scherbakov
Dmitriy,

ScanQueries currently support only one partition. I will extend them to
support multiple partitions.

For distributed joins, partitions will only be applied at the "map" query step.
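
To make the multi-partition scan concrete, here is a minimal sketch in plain Python. It is not Ignite code: the partition count, the modulo affinity function, and the names `build_cache` and `scan` are all assumptions made up for the illustration.

```python
PARTITIONS = 8  # hypothetical partition count, chosen only for this example


def partition_of(key):
    """Toy affinity function (assumption): map an integer key to a partition id."""
    return key % PARTITIONS


def build_cache(entries):
    """Group entries by partition, roughly as a partitioned cache stores them."""
    cache = {p: {} for p in range(PARTITIONS)}
    for key, value in entries.items():
        cache[partition_of(key)][key] = value
    return cache


def scan(cache, partitions=None):
    """Yield (key, value) pairs from the requested partitions only.

    partitions=None means a full scan over every partition.
    """
    if partitions is None:
        wanted = sorted(cache)
    else:
        wanted = sorted(set(partitions))  # duplicates in the request are ignored
        unknown = [p for p in wanted if p not in cache]
        if unknown:
            raise ValueError(f"unknown partitions: {unknown}")
    for p in wanted:
        for key in sorted(cache[p]):
            yield key, cache[p][key]
```

With keys 0 through 19 loaded, `scan(cache, [1, 3])` visits only the keys whose partition id is 1 or 3, and `scan(cache)` visits all of them.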

2017-01-21 21:19 GMT+03:00 Dmitriy Setrakyan <[hidden email]>:

> On Sat, Jan 21, 2017 at 1:53 AM, Alexei Scherbakov <
> [hidden email]> wrote:
>
> > >> > > > 5. I have the same understanding. Distributed joins will ignore the
> > >> > > > setting.
> > >> > > > This is not implemented yet..
> > >> > > >
> > >> > >
> > >> > > And again, this will be very confusing to users. Any chance we can throw an
> > >> > > exception with a proper error message here?
> > >> > >
> > >> >
> > >> > I hope to make it working too. But first I need a review of current PR
> > >> > state to understand whether I'm moving in right direction or not.
> > >> >
> > >>
> > >> What behavior are you proposing to implement?
> > >>
> > >> Alexey, I have noticed in your comments that you are adding this support
> > >> only for the SQL queries. Why not make it consistent across all the
> > >> queries?
> > >>
> > >
> > > Initially I had no such intentions, because I do not use other query types.
> > > But if the community has the need of this, why not.
> > > I'll start working on it next week.
> > >
> >
>
> Alexey, I think ScanQueries already had this support. If you are
> deprecating the old methods, then you need to move this logic to the new
> methods.
>
> Also, what behavior are you implementing for the distributed joins?
>
> D.
>



--

Best regards,
Alexei Scherbakov

Re: Allow distributed SQL query execution over explicit set of partitions

dsetrakyan
On Sun, Jan 22, 2017 at 4:46 AM, Alexei Scherbakov <
[hidden email]> wrote:

> Dmitriy,
>
> ScanQueries currently support only one partition. I will extend it to
> support multiple partitions.
>
> For distributed joins partitions will only be applied on "map" query step.
>

Will it still be possible to get data from the partitions that were not
specified?

Re: Allow distributed SQL query execution over explicit set of partitions

Alexei Scherbakov
Yes, it will be possible because distributed joins are executed using
broadcast queries.
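
A toy model of that effect, in plain Python rather than Ignite code. The two tables, the modulo affinity function, and the name `distributed_join` are assumptions for illustration only; the sketch just shows a partition filter applied to the "map" side while join lookups remain unfiltered.

```python
PARTITIONS = 4  # hypothetical partition count for the example


def partition_of(key):
    """Toy affinity function (assumption): integer key modulo partition count."""
    return key % PARTITIONS


def distributed_join(orders, customers, partitions):
    """Join orders to customers, filtering partitions on the map side only.

    orders:    {order_id: customer_id}, partitioned by order_id
    customers: {customer_id: name}, partitioned by customer_id
    Returns the joined rows and the set of customer partitions that were read.
    """
    wanted = set(partitions)
    rows = []
    touched = set()
    for order_id in sorted(orders):
        if partition_of(order_id) not in wanted:
            continue  # map step: rows outside the requested partitions are skipped
        customer_id = orders[order_id]
        # Join lookup: not restricted by `partitions`, so it may read any partition.
        if customer_id in customers:
            touched.add(partition_of(customer_id))
            rows.append((order_id, customers[customer_id]))
    return rows, touched
```

For example, with `orders = {0: 1, 4: 2, 5: 3}` and `customers = {1: "Ann", 2: "Bob", 3: "Cy"}`, restricting the query to partition 0 still reads customer rows that live in partitions 1 and 2.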

2017-01-22 15:49 GMT+03:00 Dmitriy Setrakyan <[hidden email]>:

> On Sun, Jan 22, 2017 at 4:46 AM, Alexei Scherbakov <
> [hidden email]> wrote:
>
> > Dmitriy,
> >
> > ScanQueries currently support only one partition. I will extend it to
> > support multiple partitions.
> >
> > For distributed joins partitions will only be applied on "map" query step.
> >
>
> Will it still be possible to get data from the partitions that were not
> specified?
>



--

Best regards,
Alexei Scherbakov

Re: Allow distributed SQL query execution over explicit set of partitions

dsetrakyan
On Sun, Jan 22, 2017 at 5:06 AM, Alexei Scherbakov <
[hidden email]> wrote:

> Yes, it will be possible because distributed joins are executed using
> broadcast queries.
>
>
In this case, why even bother supporting non-collocated joins? We need to
throw an exception in this case.

> 2017-01-22 15:49 GMT+03:00 Dmitriy Setrakyan <[hidden email]>:

>
> > On Sun, Jan 22, 2017 at 4:46 AM, Alexei Scherbakov <
> > [hidden email]> wrote:
> >
> > > Dmitriy,
> > >
> > > ScanQueries currently support only one partition. I will extend it to
> > > support multiple partitions.
> > >
> > > For distributed joins partitions will only be applied on "map" query step.
> > >
> >
> > Will it still be possible to get data from the partitions that were not
> > specified?
> >
>
>
>
> --
>
> Best regards,
> Alexei Scherbakov
>

Re: Allow distributed SQL query execution over explicit set of partitions

Alexei Scherbakov
Dmitriy,

This can still make sense for some scenarios, because we could limit the number
of initial map requests, reducing overall query overhead.

Are you still sure we need to throw an exception?

2017-01-23 1:49 GMT+03:00 Dmitriy Setrakyan <[hidden email]>:

> On Sun, Jan 22, 2017 at 5:06 AM, Alexei Scherbakov <
> [hidden email]> wrote:
>
> > Yes, it will be possible because distributed joins are executed using
> > broadcast queries.
> >
> >
> In this case why even bother supporting non-collocated joins? We need to
> throw an exception in this case.
>
> 2017-01-22 15:49 GMT+03:00 Dmitriy Setrakyan <[hidden email]>:
> >
> > > On Sun, Jan 22, 2017 at 4:46 AM, Alexei Scherbakov <
> > > [hidden email]> wrote:
> > >
> > > > Dmitriy,
> > > >
> > > > ScanQueries currently support only one partition. I will extend it to
> > > > support multiple partitions.
> > > >
> > > > > For distributed joins partitions will only be applied on "map" query step.
> > > >
> > >
> > > Will it still be possible to get data from the partitions that were not
> > > specified?
> > >
> >
> >
> >
> > --
> >
> > Best regards,
> > Alexei Scherbakov
> >
>



--

Best regards,
Alexei Scherbakov

Re: Allow distributed SQL query execution over explicit set of partitions

dsetrakyan
On Mon, Jan 23, 2017 at 11:16 AM, Alexei Scherbakov <
[hidden email]> wrote:

> Dmitriy,
>
> This still can make sense for some scenarios, because we could limit number
> of initial map requests reducing overall query overhead.
>
> Are you still sure we need to throw an exception ?
>

The outcome and the resulting behavior need to be absolutely clear to our
users. If we can't provide any sort of guarantee here, I would disallow it
altogether.
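
Disallowing the combination could be as simple as a check at query-setup time. The following is a hypothetical sketch in plain Python, not Ignite code; `QueryConfigError`, `validate_query`, and the parameter names are invented for the illustration.

```python
class QueryConfigError(ValueError):
    """Raised when query settings would lead to unclear results (hypothetical)."""


def validate_query(partitions, distributed_joins):
    """Reject explicit partitions combined with distributed joins.

    partitions: None for "all partitions", or a list of partition ids.
    distributed_joins: True if non-collocated joins are enabled for the query.
    """
    if partitions is None:
        return  # no explicit partition set, nothing to check
    if len(partitions) == 0:
        raise QueryConfigError("explicit partition set must not be empty")
    if distributed_joins:
        raise QueryConfigError(
            "explicit partitions cannot be combined with distributed joins: "
            "join lookups could read partitions that were not specified"
        )
```

Failing fast with a descriptive message keeps the behavior unambiguous: the user either gets results limited to the requested partitions or an error explaining why not.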


>
> 2017-01-23 1:49 GMT+03:00 Dmitriy Setrakyan <[hidden email]>:
>
> > On Sun, Jan 22, 2017 at 5:06 AM, Alexei Scherbakov <
> > [hidden email]> wrote:
> >
> > > Yes, it will be possible because distributed joins are executed using
> > > broadcast queries.
> > >
> > >
> > In this case why even bother supporting non-collocated joins? We need to
> > throw an exception in this case.
> >
> > 2017-01-22 15:49 GMT+03:00 Dmitriy Setrakyan <[hidden email]>:
> > >
> > > > On Sun, Jan 22, 2017 at 4:46 AM, Alexei Scherbakov <
> > > > [hidden email]> wrote:
> > > >
> > > > > Dmitriy,
> > > > >
> > > > > > ScanQueries currently support only one partition. I will extend it to
> > > > > > support multiple partitions.
> > > > > >
> > > > > > For distributed joins partitions will only be applied on "map" query step.
> > > > > >
> > > > >
> > > > > Will it still be possible to get data from the partitions that were not
> > > > > specified?
> > > >
> > >
> > >
> > >
> > > --
> > >
> > > Best regards,
> > > Alexei Scherbakov
> > >
> >
>
>
>
> --
>
> Best regards,
> Alexei Scherbakov
>