Additional SQL metrics

classic Classic list List threaded Threaded
16 messages Options
Reply | Threaded
Open this post in threaded view
|

Additional SQL metrics

dmagda
Igniters,

Let’s shed more light on SQL query execution internals introducing a set of useful metrics (https://issues.apache.org/jira/browse/IGNITE-4757).

Per-query metrics. Total history size is defined by *CacheConfiguration.getQueryDetailMetricsSize*:
* if a query was executed in the collocated or non-collocated mode. Three results are valid: collocated, non-collocated, simple query (no joins).
* non-collocated query: size of the data exchanged between the nodes to complete a join.
* non-collocated query: did a query do broadcast or unicast to get data needed to complete a join.
* non-collocated and collocated query: a part of the time spent joining the data.

CacheMetrics:
* an average number of executed SQL queries (collocated, non-collocated, simple query (no joins)).

Please don’t hesitate do share suggest another metrics or improve proposed ones.


Denis
Reply | Threaded
Open this post in threaded view
|

Re: Additional SQL metrics

dmagda
BTW,

What if we expose per-query metrics below as a part of EXPLAIN ANALYZE? Sergi, is this feasible?


Denis

> On Feb 27, 2017, at 2:35 PM, Denis Magda <[hidden email]> wrote:
>
> Igniters,
>
> Let’s shed more light on SQL query execution internals introducing a set of useful metrics (https://issues.apache.org/jira/browse/IGNITE-4757).
>
> Per-query metrics. Total history size is defined by *CacheConfiguration.getQueryDetailMetricsSize*:
> * if a query was executed in the collocated or non-collocated mode. Three results are valid: collocated, non-collocated, simple query (no joins).
> * non-collocated query: size of the data exchanged between the nodes to complete a join.
> * non-collocated query: did a query do broadcast or unicast to get data needed to complete a join.
> * non-collocated and collocated query: a part of the time spent joining the data.
>
> CacheMetrics:
> * an average number of executed SQL queries (collocated, non-collocated, simple query (no joins)).
>
> Please don’t hesitate do share suggest another metrics or improve proposed ones.
>
> —
> Denis

Reply | Threaded
Open this post in threaded view
|

Re: Additional SQL metrics

Vladimir Ozerov
Denis,

Query execution is complex process involving different stages which are not
very easy to match with each other. Especially provided that any node can
leave topology at any time. Another problem is that engine evolves and
metrics like "did a query do broadcast or unicast" may easily become
useless at some point, because for example there will be neither unicast,
nor broadast, but something different. On the other hand I completely agree
that performance monitoring is essential part of any mature DBMS.

I would start with metrics which are both very basic and easy to implement
at the same time. For example we can add fingerprint (hash) to every query
which will be used to join "map" and "reduce" parts with each other and add
the following basic metrics:
1) Execution count for particular query
2) Number of map nodes - min, max, avg
3) Map step duration (if applicable) - min, max, avg
4) Reduce step duration (if applicable) - min, max, avg

Once done users will be able to get statistics for particular queries.

Vladimir.


On Tue, Feb 28, 2017 at 3:12 AM, Denis Magda <[hidden email]> wrote:

> BTW,
>
> What if we expose per-query metrics below as a part of EXPLAIN ANALYZE?
> Sergi, is this feasible?
>
> —
> Denis
>
> > On Feb 27, 2017, at 2:35 PM, Denis Magda <[hidden email]> wrote:
> >
> > Igniters,
> >
> > Let’s shed more light on SQL query execution internals introducing a set
> of useful metrics (https://issues.apache.org/jira/browse/IGNITE-4757).
> >
> > Per-query metrics. Total history size is defined by *CacheConfiguration.
> getQueryDetailMetricsSize*:
> > * if a query was executed in the collocated or non-collocated mode.
> Three results are valid: collocated, non-collocated, simple query (no
> joins).
> > * non-collocated query: size of the data exchanged between the nodes to
> complete a join.
> > * non-collocated query: did a query do broadcast or unicast to get data
> needed to complete a join.
> > * non-collocated and collocated query: a part of the time spent joining
> the data.
> >
> > CacheMetrics:
> > * an average number of executed SQL queries (collocated, non-collocated,
> simple query (no joins)).
> >
> > Please don’t hesitate do share suggest another metrics or improve
> proposed ones.
> >
> > —
> > Denis
>
>
Reply | Threaded
Open this post in threaded view
|

Re: Additional SQL metrics

dsetrakyan
I think some of the metrics specified by Denis also make sense, so I would
add them as well. See below...

On Thu, Mar 2, 2017 at 12:36 AM, Vladimir Ozerov <[hidden email]>
wrote:

> Denis,
>
> Query execution is complex process involving different stages which are not
> very easy to match with each other. Especially provided that any node can
> leave topology at any time. Another problem is that engine evolves and
> metrics like "did a query do broadcast or unicast" may easily become
> useless at some point, because for example there will be neither unicast,
> nor broadast, but something different. On the other hand I completely agree
> that performance monitoring is essential part of any mature DBMS.
>
> I would start with metrics which are both very basic and easy to implement
> at the same time. For example we can add fingerprint (hash) to every query
> which will be used to join "map" and "reduce" parts with each other and add
> the following basic metrics:
> 1) Execution count for particular query
> 2) Number of map nodes - min, max, avg
>

(1) and (2) makes sense


> 3) Map step duration (if applicable) - min, max,

4) Reduce step duration (if applicable) - min, max, avg
>

Not sure if (3) and (4) are needed. I would only add them if they are easy
to implement.

I would also add these:

5) Collocated: yes/no
6) last execution time
7) min/max/average execution duration


>
> Once done users will be able to get statistics for particular queries.
>
> Vladimir.
>
>
> On Tue, Feb 28, 2017 at 3:12 AM, Denis Magda <[hidden email]> wrote:
>
> > BTW,
> >
> > What if we expose per-query metrics below as a part of EXPLAIN ANALYZE?
> > Sergi, is this feasible?
> >
> > —
> > Denis
> >
> > > On Feb 27, 2017, at 2:35 PM, Denis Magda <[hidden email]> wrote:
> > >
> > > Igniters,
> > >
> > > Let’s shed more light on SQL query execution internals introducing a
> set
> > of useful metrics (https://issues.apache.org/jira/browse/IGNITE-4757).
> > >
> > > Per-query metrics. Total history size is defined by
> *CacheConfiguration.
> > getQueryDetailMetricsSize*:
> > > * if a query was executed in the collocated or non-collocated mode.
> > Three results are valid: collocated, non-collocated, simple query (no
> > joins).
> > > * non-collocated query: size of the data exchanged between the nodes to
> > complete a join.
> > > * non-collocated query: did a query do broadcast or unicast to get data
> > needed to complete a join.
> > > * non-collocated and collocated query: a part of the time spent joining
> > the data.
> > >
> > > CacheMetrics:
> > > * an average number of executed SQL queries (collocated,
> non-collocated,
> > simple query (no joins)).
> > >
> > > Please don’t hesitate do share suggest another metrics or improve
> > proposed ones.
> > >
> > > —
> > > Denis
> >
> >
>
Reply | Threaded
Open this post in threaded view
|

Re: Additional SQL metrics

dmagda
Vovan,

Your metrics make perfect sense to me. However, I see a high demand for JOINs based metrics especially from those who give a try to non-collocated joins in production  and want to measure them somehow. This is why, personally, I prefer to see the metrics below in the top priority list as well:

if a query was executed in the collocated or non-collocated mode. Three results are valid: collocated, non-collocated, simple query (no joins).
non-collocated query: size of the data exchanged between the nodes to complete a specific join. If there are multiple joins in the query we need to provide this metric for every of them.
non-collocated and collocated query: a part of the time spent joining the data. If there are multiple joins in the query we need to provide this metric for every of them.

As for “unicast” and “broadcast”, agree, let’s ignore it for now.

In any case, can we include timing information (map phase, reduce phase, join phase) into an execution plan produced by H2? Are there any implementation hooks?


Denis


> On Mar 2, 2017, at 12:02 PM, Dmitriy Setrakyan <[hidden email]> wrote:
>
> I think some of the metrics specified by Denis also make sense, so I would
> add them as well. See below...
>
> On Thu, Mar 2, 2017 at 12:36 AM, Vladimir Ozerov <[hidden email] <mailto:[hidden email]>>
> wrote:
>
>> Denis,
>>
>> Query execution is complex process involving different stages which are not
>> very easy to match with each other. Especially provided that any node can
>> leave topology at any time. Another problem is that engine evolves and
>> metrics like "did a query do broadcast or unicast" may easily become
>> useless at some point, because for example there will be neither unicast,
>> nor broadast, but something different. On the other hand I completely agree
>> that performance monitoring is essential part of any mature DBMS.
>>
>> I would start with metrics which are both very basic and easy to implement
>> at the same time. For example we can add fingerprint (hash) to every query
>> which will be used to join "map" and "reduce" parts with each other and add
>> the following basic metrics:
>> 1) Execution count for particular query
>> 2) Number of map nodes - min, max, avg
>>
>
> (1) and (2) makes sense
>
>
>> 3) Map step duration (if applicable) - min, max,
>
> 4) Reduce step duration (if applicable) - min, max, avg
>>
>
> Not sure if (3) and (4) are needed. I would only add them if they are easy
> to implement.
>
> I would also add these:
>
> 5) Collocated: yes/no
> 6) last execution time
> 7) min/max/average execution duration
>
>
>>
>> Once done users will be able to get statistics for particular queries.
>>
>> Vladimir.
>>
>>
>> On Tue, Feb 28, 2017 at 3:12 AM, Denis Magda <[hidden email]> wrote:
>>
>>> BTW,
>>>
>>> What if we expose per-query metrics below as a part of EXPLAIN ANALYZE?
>>> Sergi, is this feasible?
>>>
>>> —
>>> Denis
>>>
>>>> On Feb 27, 2017, at 2:35 PM, Denis Magda <[hidden email]> wrote:
>>>>
>>>> Igniters,
>>>>
>>>> Let’s shed more light on SQL query execution internals introducing a
>> set
>>> of useful metrics (https://issues.apache.org/jira/browse/IGNITE-4757).
>>>>
>>>> Per-query metrics. Total history size is defined by
>> *CacheConfiguration.
>>> getQueryDetailMetricsSize*:
>>>> * if a query was executed in the collocated or non-collocated mode.
>>> Three results are valid: collocated, non-collocated, simple query (no
>>> joins).
>>>> * non-collocated query: size of the data exchanged between the nodes to
>>> complete a join.
>>>> * non-collocated query: did a query do broadcast or unicast to get data
>>> needed to complete a join.
>>>> * non-collocated and collocated query: a part of the time spent joining
>>> the data.
>>>>
>>>> CacheMetrics:
>>>> * an average number of executed SQL queries (collocated,
>> non-collocated,
>>> simple query (no joins)).
>>>>
>>>> Please don’t hesitate do share suggest another metrics or improve
>>> proposed ones.
>>>>
>>>> —
>>>> Denis

Reply | Threaded
Open this post in threaded view
|

Re: Additional SQL metrics

dsetrakyan
By the way, I am assuming that we are talking about per-query metrics, in
which case we should specify metrics history size, so we don't keep all the
queries in memory forever. I don't think it makes sense to have metrics
aggregated across the queries. Just wanted to clarify this.

On Thu, Mar 2, 2017 at 12:31 PM, Denis Magda <[hidden email]> wrote:

> Vovan,
>
> Your metrics make perfect sense to me. However, I see a high demand for
> JOINs based metrics especially from those who give a try to non-collocated
> joins in production  and want to measure them somehow. This is why,
> personally, I prefer to see the metrics below in the top priority list as
> well:
>
> if a query was executed in the collocated or non-collocated mode. Three
> results are valid: collocated, non-collocated, simple query (no joins).
> non-collocated query: size of the data exchanged between the nodes to
> complete a specific join. If there are multiple joins in the query we need
> to provide this metric for every of them.
> non-collocated and collocated query: a part of the time spent joining the
> data. If there are multiple joins in the query we need to provide this
> metric for every of them.
>
> As for “unicast” and “broadcast”, agree, let’s ignore it for now.
>
> In any case, can we include timing information (map phase, reduce phase,
> join phase) into an execution plan produced by H2? Are there any
> implementation hooks?
>
> —
> Denis
>
>
> > On Mar 2, 2017, at 12:02 PM, Dmitriy Setrakyan <[hidden email]>
> wrote:
> >
> > I think some of the metrics specified by Denis also make sense, so I
> would
> > add them as well. See below...
> >
> > On Thu, Mar 2, 2017 at 12:36 AM, Vladimir Ozerov <[hidden email]
> <mailto:[hidden email]>>
> > wrote:
> >
> >> Denis,
> >>
> >> Query execution is complex process involving different stages which are
> not
> >> very easy to match with each other. Especially provided that any node
> can
> >> leave topology at any time. Another problem is that engine evolves and
> >> metrics like "did a query do broadcast or unicast" may easily become
> >> useless at some point, because for example there will be neither
> unicast,
> >> nor broadast, but something different. On the other hand I completely
> agree
> >> that performance monitoring is essential part of any mature DBMS.
> >>
> >> I would start with metrics which are both very basic and easy to
> implement
> >> at the same time. For example we can add fingerprint (hash) to every
> query
> >> which will be used to join "map" and "reduce" parts with each other and
> add
> >> the following basic metrics:
> >> 1) Execution count for particular query
> >> 2) Number of map nodes - min, max, avg
> >>
> >
> > (1) and (2) makes sense
> >
> >
> >> 3) Map step duration (if applicable) - min, max,
> >
> > 4) Reduce step duration (if applicable) - min, max, avg
> >>
> >
> > Not sure if (3) and (4) are needed. I would only add them if they are
> easy
> > to implement.
> >
> > I would also add these:
> >
> > 5) Collocated: yes/no
> > 6) last execution time
> > 7) min/max/average execution duration
> >
> >
> >>
> >> Once done users will be able to get statistics for particular queries.
> >>
> >> Vladimir.
> >>
> >>
> >> On Tue, Feb 28, 2017 at 3:12 AM, Denis Magda <[hidden email]> wrote:
> >>
> >>> BTW,
> >>>
> >>> What if we expose per-query metrics below as a part of EXPLAIN ANALYZE?
> >>> Sergi, is this feasible?
> >>>
> >>> —
> >>> Denis
> >>>
> >>>> On Feb 27, 2017, at 2:35 PM, Denis Magda <[hidden email]> wrote:
> >>>>
> >>>> Igniters,
> >>>>
> >>>> Let’s shed more light on SQL query execution internals introducing a
> >> set
> >>> of useful metrics (https://issues.apache.org/jira/browse/IGNITE-4757).
> >>>>
> >>>> Per-query metrics. Total history size is defined by
> >> *CacheConfiguration.
> >>> getQueryDetailMetricsSize*:
> >>>> * if a query was executed in the collocated or non-collocated mode.
> >>> Three results are valid: collocated, non-collocated, simple query (no
> >>> joins).
> >>>> * non-collocated query: size of the data exchanged between the nodes
> to
> >>> complete a join.
> >>>> * non-collocated query: did a query do broadcast or unicast to get
> data
> >>> needed to complete a join.
> >>>> * non-collocated and collocated query: a part of the time spent
> joining
> >>> the data.
> >>>>
> >>>> CacheMetrics:
> >>>> * an average number of executed SQL queries (collocated,
> >> non-collocated,
> >>> simple query (no joins)).
> >>>>
> >>>> Please don’t hesitate do share suggest another metrics or improve
> >>> proposed ones.
> >>>>
> >>>> —
> >>>> Denis
>
>
Reply | Threaded
Open this post in threaded view
|

Re: Additional SQL metrics

Vladimir Ozerov
In reply to this post by dmagda
Denis,

The main problem with suggested metrics is that they implies that ceratin
internal mechanics work in predefined way. For example, what is JOIN
metrics? There are no guarantees that JOIN in user's query will be
translated to a real physical join. What if several different query
execution pieces happen in parallel? What if we rework our distributed
query engine from pull to push approach for performance reasons and there
will be no JOINs in classical sense?

This is why I think that we should start with very basic things. Something
like:
1) Query exec count
2) Query exec time (first define what "time" means) - min, max, avg
3) Number of bytes exchanged between nodes during query execution
4) Number of returned rows - min, max, avg

Once we have base numbers in place, we can think of carefully integrating
and enhancing all pieces of query execution into more verbose formats,
similar to query plans with relative weights in classical RDBMS systems.

Thoughts?



On Thu, Mar 2, 2017 at 11:31 PM, Denis Magda <[hidden email]> wrote:

> Vovan,
>
> Your metrics make perfect sense to me. However, I see a high demand for
> JOINs based metrics especially from those who give a try to non-collocated
> joins in production  and want to measure them somehow. This is why,
> personally, I prefer to see the metrics below in the top priority list as
> well:
>
> if a query was executed in the collocated or non-collocated mode. Three
> results are valid: collocated, non-collocated, simple query (no joins).
> non-collocated query: size of the data exchanged between the nodes to
> complete a specific join. If there are multiple joins in the query we need
> to provide this metric for every of them.
> non-collocated and collocated query: a part of the time spent joining the
> data. If there are multiple joins in the query we need to provide this
> metric for every of them.
>
> As for “unicast” and “broadcast”, agree, let’s ignore it for now.
>
> In any case, can we include timing information (map phase, reduce phase,
> join phase) into an execution plan produced by H2? Are there any
> implementation hooks?
>
> —
> Denis
>
>
> > On Mar 2, 2017, at 12:02 PM, Dmitriy Setrakyan <[hidden email]>
> wrote:
> >
> > I think some of the metrics specified by Denis also make sense, so I
> would
> > add them as well. See below...
> >
> > On Thu, Mar 2, 2017 at 12:36 AM, Vladimir Ozerov <[hidden email]
> <mailto:[hidden email]>>
> > wrote:
> >
> >> Denis,
> >>
> >> Query execution is complex process involving different stages which are
> not
> >> very easy to match with each other. Especially provided that any node
> can
> >> leave topology at any time. Another problem is that engine evolves and
> >> metrics like "did a query do broadcast or unicast" may easily become
> >> useless at some point, because for example there will be neither
> unicast,
> >> nor broadast, but something different. On the other hand I completely
> agree
> >> that performance monitoring is essential part of any mature DBMS.
> >>
> >> I would start with metrics which are both very basic and easy to
> implement
> >> at the same time. For example we can add fingerprint (hash) to every
> query
> >> which will be used to join "map" and "reduce" parts with each other and
> add
> >> the following basic metrics:
> >> 1) Execution count for particular query
> >> 2) Number of map nodes - min, max, avg
> >>
> >
> > (1) and (2) makes sense
> >
> >
> >> 3) Map step duration (if applicable) - min, max,
> >
> > 4) Reduce step duration (if applicable) - min, max, avg
> >>
> >
> > Not sure if (3) and (4) are needed. I would only add them if they are
> easy
> > to implement.
> >
> > I would also add these:
> >
> > 5) Collocated: yes/no
> > 6) last execution time
> > 7) min/max/average execution duration
> >
> >
> >>
> >> Once done users will be able to get statistics for particular queries.
> >>
> >> Vladimir.
> >>
> >>
> >> On Tue, Feb 28, 2017 at 3:12 AM, Denis Magda <[hidden email]> wrote:
> >>
> >>> BTW,
> >>>
> >>> What if we expose per-query metrics below as a part of EXPLAIN ANALYZE?
> >>> Sergi, is this feasible?
> >>>
> >>> —
> >>> Denis
> >>>
> >>>> On Feb 27, 2017, at 2:35 PM, Denis Magda <[hidden email]> wrote:
> >>>>
> >>>> Igniters,
> >>>>
> >>>> Let’s shed more light on SQL query execution internals introducing a
> >> set
> >>> of useful metrics (https://issues.apache.org/jira/browse/IGNITE-4757).
> >>>>
> >>>> Per-query metrics. Total history size is defined by
> >> *CacheConfiguration.
> >>> getQueryDetailMetricsSize*:
> >>>> * if a query was executed in the collocated or non-collocated mode.
> >>> Three results are valid: collocated, non-collocated, simple query (no
> >>> joins).
> >>>> * non-collocated query: size of the data exchanged between the nodes
> to
> >>> complete a join.
> >>>> * non-collocated query: did a query do broadcast or unicast to get
> data
> >>> needed to complete a join.
> >>>> * non-collocated and collocated query: a part of the time spent
> joining
> >>> the data.
> >>>>
> >>>> CacheMetrics:
> >>>> * an average number of executed SQL queries (collocated,
> >> non-collocated,
> >>> simple query (no joins)).
> >>>>
> >>>> Please don’t hesitate do share suggest another metrics or improve
> >>> proposed ones.
> >>>>
> >>>> —
> >>>> Denis
>
>
Reply | Threaded
Open this post in threaded view
|

Re: Additional SQL metrics

dsetrakyan
Vladimir, are you talking about per-query metrics?

On Thu, Mar 2, 2017 at 1:32 PM, Vladimir Ozerov <[hidden email]>
wrote:

> Denis,
>
> The main problem with suggested metrics is that they implies that ceratin
> internal mechanics work in predefined way. For example, what is JOIN
> metrics? There are no guarantees that JOIN in user's query will be
> translated to a real physical join. What if several different query
> execution pieces happen in parallel? What if we rework our distributed
> query engine from pull to push approach for performance reasons and there
> will be no JOINs in classical sense?
>
> This is why I think that we should start with very basic things. Something
> like:
> 1) Query exec count
> 2) Query exec time (first define what "time" means) - min, max, avg
> 3) Number of bytes exchanged between nodes during query execution
> 4) Number of returned rows - min, max, avg
>
> Once we have base numbers in place, we can think of carefully integrating
> and enhancing all pieces of query execution into more verbose formats,
> similar to query plans with relative weights in classical RDBMS systems.
>
> Thoughts?
>
>
>
> On Thu, Mar 2, 2017 at 11:31 PM, Denis Magda <[hidden email]> wrote:
>
> > Vovan,
> >
> > Your metrics make perfect sense to me. However, I see a high demand for
> > JOINs based metrics especially from those who give a try to
> non-collocated
> > joins in production  and want to measure them somehow. This is why,
> > personally, I prefer to see the metrics below in the top priority list as
> > well:
> >
> > if a query was executed in the collocated or non-collocated mode. Three
> > results are valid: collocated, non-collocated, simple query (no joins).
> > non-collocated query: size of the data exchanged between the nodes to
> > complete a specific join. If there are multiple joins in the query we
> need
> > to provide this metric for every of them.
> > non-collocated and collocated query: a part of the time spent joining the
> > data. If there are multiple joins in the query we need to provide this
> > metric for every of them.
> >
> > As for “unicast” and “broadcast”, agree, let’s ignore it for now.
> >
> > In any case, can we include timing information (map phase, reduce phase,
> > join phase) into an execution plan produced by H2? Are there any
> > implementation hooks?
> >
> > —
> > Denis
> >
> >
> > > On Mar 2, 2017, at 12:02 PM, Dmitriy Setrakyan <[hidden email]>
> > wrote:
> > >
> > > I think some of the metrics specified by Denis also make sense, so I
> > would
> > > add them as well. See below...
> > >
> > > On Thu, Mar 2, 2017 at 12:36 AM, Vladimir Ozerov <[hidden email]
> > <mailto:[hidden email]>>
> > > wrote:
> > >
> > >> Denis,
> > >>
> > >> Query execution is complex process involving different stages which
> are
> > not
> > >> very easy to match with each other. Especially provided that any node
> > can
> > >> leave topology at any time. Another problem is that engine evolves and
> > >> metrics like "did a query do broadcast or unicast" may easily become
> > >> useless at some point, because for example there will be neither
> > unicast,
> > >> nor broadast, but something different. On the other hand I completely
> > agree
> > >> that performance monitoring is essential part of any mature DBMS.
> > >>
> > >> I would start with metrics which are both very basic and easy to
> > implement
> > >> at the same time. For example we can add fingerprint (hash) to every
> > query
> > >> which will be used to join "map" and "reduce" parts with each other
> and
> > add
> > >> the following basic metrics:
> > >> 1) Execution count for particular query
> > >> 2) Number of map nodes - min, max, avg
> > >>
> > >
> > > (1) and (2) makes sense
> > >
> > >
> > >> 3) Map step duration (if applicable) - min, max,
> > >
> > > 4) Reduce step duration (if applicable) - min, max, avg
> > >>
> > >
> > > Not sure if (3) and (4) are needed. I would only add them if they are
> > easy
> > > to implement.
> > >
> > > I would also add these:
> > >
> > > 5) Collocated: yes/no
> > > 6) last execution time
> > > 7) min/max/average execution duration
> > >
> > >
> > >>
> > >> Once done users will be able to get statistics for particular queries.
> > >>
> > >> Vladimir.
> > >>
> > >>
> > >> On Tue, Feb 28, 2017 at 3:12 AM, Denis Magda <[hidden email]>
> wrote:
> > >>
> > >>> BTW,
> > >>>
> > >>> What if we expose per-query metrics below as a part of EXPLAIN
> ANALYZE?
> > >>> Sergi, is this feasible?
> > >>>
> > >>> —
> > >>> Denis
> > >>>
> > >>>> On Feb 27, 2017, at 2:35 PM, Denis Magda <[hidden email]> wrote:
> > >>>>
> > >>>> Igniters,
> > >>>>
> > >>>> Let’s shed more light on SQL query execution internals introducing a
> > >> set
> > >>> of useful metrics (https://issues.apache.org/jira/browse/IGNITE-4757
> ).
> > >>>>
> > >>>> Per-query metrics. Total history size is defined by
> > >> *CacheConfiguration.
> > >>> getQueryDetailMetricsSize*:
> > >>>> * if a query was executed in the collocated or non-collocated mode.
> > >>> Three results are valid: collocated, non-collocated, simple query (no
> > >>> joins).
> > >>>> * non-collocated query: size of the data exchanged between the nodes
> > to
> > >>> complete a join.
> > >>>> * non-collocated query: did a query do broadcast or unicast to get
> > data
> > >>> needed to complete a join.
> > >>>> * non-collocated and collocated query: a part of the time spent
> > joining
> > >>> the data.
> > >>>>
> > >>>> CacheMetrics:
> > >>>> * an average number of executed SQL queries (collocated,
> > >> non-collocated,
> > >>> simple query (no joins)).
> > >>>>
> > >>>> Please don’t hesitate do share suggest another metrics or improve
> > >>> proposed ones.
> > >>>>
> > >>>> —
> > >>>> Denis
> >
> >
>
Reply | Threaded
Open this post in threaded view
|

Re: Additional SQL metrics

dmagda
In reply to this post by Vladimir Ozerov
Vovan,

When I’m speaking of JOIN metrics I’m simply assume that we need to add metrics relevant for queries with joins, metrics that will help us get more insights on non-collocated and collocated joins execution flow.

> 1) Query exec count
> 2) Query exec time (first define what "time" means) - min, max, avg

Total query execution time might not be helpful in the trickiest cases. What if you have multiple joins in your query? How do I know which one contributes to the execution most?

So, I would do a breakdown having total time, map time, per-join time, reduce time. Hope it’s possible. If it’s unclear how to support everything at the first place then it’s a different question. Let’s create several tickets and start implementing everything gracefully.

> 3) Number of bytes exchanged between nodes during query execution

It will be really helpful to make a breakdown showing a number of bytes exchanged per-join (physical join). Again, if you believe it makes sense to do the breakdown later then let’s create an additional ticket for the 2 tier metrics then.

> 4) Number of returned rows - min, max, avg


Plus, let’s add the following to the list:

5) Collocated: yes or no


Denis

> On Mar 2, 2017, at 1:32 PM, Vladimir Ozerov <[hidden email]> wrote:
>
> Denis,
>
> The main problem with suggested metrics is that they implies that ceratin
> internal mechanics work in predefined way. For example, what is JOIN
> metrics? There are no guarantees that JOIN in user's query will be
> translated to a real physical join. What if several different query
> execution pieces happen in parallel? What if we rework our distributed
> query engine from pull to push approach for performance reasons and there
> will be no JOINs in classical sense?
>
> This is why I think that we should start with very basic things. Something
> like:
> 1) Query exec count
> 2) Query exec time (first define what "time" means) - min, max, avg
> 3) Number of bytes exchanged between nodes during query execution
> 4) Number of returned rows - min, max, avg
>
> Once we have base numbers in place, we can think of carefully integrating
> and enhancing all pieces of query execution into more verbose formats,
> similar to query plans with relative weights in classical RDBMS systems.
>
> Thoughts?
>
>
>
> On Thu, Mar 2, 2017 at 11:31 PM, Denis Magda <[hidden email]> wrote:
>
>> Vovan,
>>
>> Your metrics make perfect sense to me. However, I see a high demand for
>> JOINs based metrics especially from those who give a try to non-collocated
>> joins in production  and want to measure them somehow. This is why,
>> personally, I prefer to see the metrics below in the top priority list as
>> well:
>>
>> if a query was executed in the collocated or non-collocated mode. Three
>> results are valid: collocated, non-collocated, simple query (no joins).
>> non-collocated query: size of the data exchanged between the nodes to
>> complete a specific join. If there are multiple joins in the query we need
>> to provide this metric for every of them.
>> non-collocated and collocated query: a part of the time spent joining the
>> data. If there are multiple joins in the query we need to provide this
>> metric for every of them.
>>
>> As for “unicast” and “broadcast”, agree, let’s ignore it for now.
>>
>> In any case, can we include timing information (map phase, reduce phase,
>> join phase) into an execution plan produced by H2? Are there any
>> implementation hooks?
>>
>> —
>> Denis
>>
>>
>>> On Mar 2, 2017, at 12:02 PM, Dmitriy Setrakyan <[hidden email]>
>> wrote:
>>>
>>> I think some of the metrics specified by Denis also make sense, so I
>> would
>>> add them as well. See below...
>>>
>>> On Thu, Mar 2, 2017 at 12:36 AM, Vladimir Ozerov <[hidden email]
>> <mailto:[hidden email]>>
>>> wrote:
>>>
>>>> Denis,
>>>>
>>>> Query execution is complex process involving different stages which are
>> not
>>>> very easy to match with each other. Especially provided that any node
>> can
>>>> leave topology at any time. Another problem is that engine evolves and
>>>> metrics like "did a query do broadcast or unicast" may easily become
>>>> useless at some point, because for example there will be neither
>> unicast,
>>>> nor broadast, but something different. On the other hand I completely
>> agree
>>>> that performance monitoring is essential part of any mature DBMS.
>>>>
>>>> I would start with metrics which are both very basic and easy to
>> implement
>>>> at the same time. For example we can add fingerprint (hash) to every
>> query
>>>> which will be used to join "map" and "reduce" parts with each other and
>> add
>>>> the following basic metrics:
>>>> 1) Execution count for particular query
>>>> 2) Number of map nodes - min, max, avg
>>>>
>>>
>>> (1) and (2) makes sense
>>>
>>>
>>>> 3) Map step duration (if applicable) - min, max,
>>>
>>> 4) Reduce step duration (if applicable) - min, max, avg
>>>>
>>>
>>> Not sure if (3) and (4) are needed. I would only add them if they are
>> easy
>>> to implement.
>>>
>>> I would also add these:
>>>
>>> 5) Collocated: yes/no
>>> 6) last execution time
>>> 7) min/max/average execution duration
>>>
>>>
>>>>
>>>> Once done users will be able to get statistics for particular queries.
>>>>
>>>> Vladimir.
>>>>
>>>>
>>>> On Tue, Feb 28, 2017 at 3:12 AM, Denis Magda <[hidden email]> wrote:
>>>>
>>>>> BTW,
>>>>>
>>>>> What if we expose per-query metrics below as a part of EXPLAIN ANALYZE?
>>>>> Sergi, is this feasible?
>>>>>
>>>>> —
>>>>> Denis
>>>>>
>>>>>> On Feb 27, 2017, at 2:35 PM, Denis Magda <[hidden email]> wrote:
>>>>>>
>>>>>> Igniters,
>>>>>>
>>>>>> Let’s shed more light on SQL query execution internals introducing a
>>>> set
>>>>> of useful metrics (https://issues.apache.org/jira/browse/IGNITE-4757).
>>>>>>
>>>>>> Per-query metrics. Total history size is defined by
>>>> *CacheConfiguration.
>>>>> getQueryDetailMetricsSize*:
>>>>>> * if a query was executed in the collocated or non-collocated mode.
>>>>> Three results are valid: collocated, non-collocated, simple query (no
>>>>> joins).
>>>>>> * non-collocated query: size of the data exchanged between the nodes
>> to
>>>>> complete a join.
>>>>>> * non-collocated query: did a query do broadcast or unicast to get
>> data
>>>>> needed to complete a join.
>>>>>> * non-collocated and collocated query: a part of the time spent
>> joining
>>>>> the data.
>>>>>>
>>>>>> CacheMetrics:
>>>>>> * an average number of executed SQL queries (collocated,
>>>> non-collocated,
>>>>> simple query (no joins)).
>>>>>>
>>>>>> Please don’t hesitate do share suggest another metrics or improve
>>>>> proposed ones.
>>>>>>
>>>>>> —
>>>>>> Denis
>>
>>

Reply | Threaded
Open this post in threaded view
|

Re: Additional SQL metrics

Sergey Kozlov
One more comment:

In general the customer is interested in slow queries details thus we can
introduce an option which will allow to store only queries executed more
than NNN seconds. It may significantly reduce the the memory consumption
for history (but logging of all queries is still available if set that
option to 0).

On Fri, Mar 3, 2017 at 1:00 AM, Denis Magda <[hidden email]> wrote:

> Vovan,
>
> When I’m speaking of JOIN metrics I’m simply assume that we need to add
> metrics relevant for queries with joins, metrics that will help us get more
> insights on non-collocated and collocated joins execution flow.
>
> > 1) Query exec count
> > 2) Query exec time (first define what "time" means) - min, max, avg
>
> Total query execution time might not be helpful in the trickiest cases.
> What if you have multiple joins in your query? How do I know which one
> contributes to the execution most?
>
> So, I would do a breakdown having total time, map time, per-join time,
> reduce time. Hope it’s possible. If it’s unclear how to support everything
> at the first place then it’s a different question. Let’s create several
> tickets and start implementing everything gracefully.
>
> > 3) Number of bytes exchanged between nodes during query execution
>
> It will be really helpful to make a breakdown showing a number of bytes
> exchanged per-join (physical join). Again, if you believe it makes sense to
> do the breakdown later then let’s create an additional ticket for the 2
> tier metrics then.
>
> > 4) Number of returned rows - min, max, avg
>
>
> Plus, let’s add the following to the list:
>
> 5) Collocated: yes or no
>
> —
> Denis
>
> > On Mar 2, 2017, at 1:32 PM, Vladimir Ozerov <[hidden email]>
> wrote:
> >
> > Denis,
> >
> > The main problem with suggested metrics is that they implies that ceratin
> > internal mechanics work in predefined way. For example, what is JOIN
> > metrics? There are no guarantees that JOIN in user's query will be
> > translated to a real physical join. What if several different query
> > execution pieces happen in parallel? What if we rework our distributed
> > query engine from pull to push approach for performance reasons and there
> > will be no JOINs in classical sense?
> >
> > This is why I think that we should start with very basic things.
> Something
> > like:
> > 1) Query exec count
> > 2) Query exec time (first define what "time" means) - min, max, avg
> > 3) Number of bytes exchanged between nodes during query execution
> > 4) Number of returned rows - min, max, avg
> >
> > Once we have base numbers in place, we can think of carefully integrating
> > and enhancing all pieces of query execution into more verbose formats,
> > similar to query plans with relative weights in classical RDBMS systems.
> >
> > Thoughts?
> >
> >
> >
> > On Thu, Mar 2, 2017 at 11:31 PM, Denis Magda <[hidden email]> wrote:
> >
> >> Vovan,
> >>
> >> Your metrics make perfect sense to me. However, I see a high demand for
> >> JOINs based metrics especially from those who give a try to
> non-collocated
> >> joins in production  and want to measure them somehow. This is why,
> >> personally, I prefer to see the metrics below in the top priority list
> as
> >> well:
> >>
> >> if a query was executed in the collocated or non-collocated mode. Three
> >> results are valid: collocated, non-collocated, simple query (no joins).
> >> non-collocated query: size of the data exchanged between the nodes to
> >> complete a specific join. If there are multiple joins in the query we
> need
> >> to provide this metric for every of them.
> >> non-collocated and collocated query: a part of the time spent joining
> the
> >> data. If there are multiple joins in the query we need to provide this
> >> metric for every of them.
> >>
> >> As for “unicast” and “broadcast”, agree, let’s ignore it for now.
> >>
> >> In any case, can we include timing information (map phase, reduce phase,
> >> join phase) into an execution plan produced by H2? Are there any
> >> implementation hooks?
> >>
> >> —
> >> Denis
> >>
> >>
> >>> On Mar 2, 2017, at 12:02 PM, Dmitriy Setrakyan <[hidden email]>
> >> wrote:
> >>>
> >>> I think some of the metrics specified by Denis also make sense, so I
> >> would
> >>> add them as well. See below...
> >>>
> >>> On Thu, Mar 2, 2017 at 12:36 AM, Vladimir Ozerov <[hidden email]
> >> <mailto:[hidden email]>>
> >>> wrote:
> >>>
> >>>> Denis,
> >>>>
> >>>> Query execution is complex process involving different stages which
> are
> >> not
> >>>> very easy to match with each other. Especially provided that any node
> >> can
> >>>> leave topology at any time. Another problem is that engine evolves and
> >>>> metrics like "did a query do broadcast or unicast" may easily become
> >>>> useless at some point, because for example there will be neither
> >> unicast,
> >>>> nor broadast, but something different. On the other hand I completely
> >> agree
> >>>> that performance monitoring is essential part of any mature DBMS.
> >>>>
> >>>> I would start with metrics which are both very basic and easy to
> >> implement
> >>>> at the same time. For example we can add fingerprint (hash) to every
> >> query
> >>>> which will be used to join "map" and "reduce" parts with each other
> and
> >> add
> >>>> the following basic metrics:
> >>>> 1) Execution count for particular query
> >>>> 2) Number of map nodes - min, max, avg
> >>>>
> >>>
> >>> (1) and (2) makes sense
> >>>
> >>>
> >>>> 3) Map step duration (if applicable) - min, max,
> >>>
> >>> 4) Reduce step duration (if applicable) - min, max, avg
> >>>>
> >>>
> >>> Not sure if (3) and (4) are needed. I would only add them if they are
> >> easy
> >>> to implement.
> >>>
> >>> I would also add these:
> >>>
> >>> 5) Collocated: yes/no
> >>> 6) last execution time
> >>> 7) min/max/average execution duration
> >>>
> >>>
> >>>>
> >>>> Once done users will be able to get statistics for particular queries.
> >>>>
> >>>> Vladimir.
> >>>>
> >>>>
> >>>> On Tue, Feb 28, 2017 at 3:12 AM, Denis Magda <[hidden email]>
> wrote:
> >>>>
> >>>>> BTW,
> >>>>>
> >>>>> What if we expose per-query metrics below as a part of EXPLAIN
> ANALYZE?
> >>>>> Sergi, is this feasible?
> >>>>>
> >>>>> —
> >>>>> Denis
> >>>>>
> >>>>>> On Feb 27, 2017, at 2:35 PM, Denis Magda <[hidden email]> wrote:
> >>>>>>
> >>>>>> Igniters,
> >>>>>>
> >>>>>> Let’s shed more light on SQL query execution internals introducing a
> >>>> set
> >>>>> of useful metrics (https://issues.apache.org/jira/browse/IGNITE-4757
> ).
> >>>>>>
> >>>>>> Per-query metrics. Total history size is defined by
> >>>> *CacheConfiguration.
> >>>>> getQueryDetailMetricsSize*:
> >>>>>> * if a query was executed in the collocated or non-collocated mode.
> >>>>> Three results are valid: collocated, non-collocated, simple query (no
> >>>>> joins).
> >>>>>> * non-collocated query: size of the data exchanged between the nodes
> >> to
> >>>>> complete a join.
> >>>>>> * non-collocated query: did a query do broadcast or unicast to get
> >> data
> >>>>> needed to complete a join.
> >>>>>> * non-collocated and collocated query: a part of the time spent
> >> joining
> >>>>> the data.
> >>>>>>
> >>>>>> CacheMetrics:
> >>>>>> * an average number of executed SQL queries (collocated,
> >>>> non-collocated,
> >>>>> simple query (no joins)).
> >>>>>>
> >>>>>> Please don’t hesitate do share suggest another metrics or improve
> >>>>> proposed ones.
> >>>>>>
> >>>>>> —
> >>>>>> Denis
> >>
> >>
>
>


--
Sergey Kozlov
GridGain Systems
www.gridgain.com
Reply | Threaded
Open this post in threaded view
|

Re: Additional SQL metrics

Valentin Kulichenko
Sergey, that's great idea! Generally, user is not interested much in some
average numbers, especially in case of SQL queries. What they need is a
list of slow queries and detailed information about the execution flow of
these particular queries.

-Val

On Fri, Mar 3, 2017 at 2:50 AM, Sergey Kozlov <[hidden email]> wrote:

> One more comment:
>
> In general the customer is interested in slow queries details thus we can
> introduce an option which will allow to store only queries executed more
> than NNN seconds. It may significantly reduce the the memory consumption
> for history (but logging of all queries is still available if set that
> option to 0).
>
> On Fri, Mar 3, 2017 at 1:00 AM, Denis Magda <[hidden email]> wrote:
>
> > Vovan,
> >
> > When I’m speaking of JOIN metrics I’m simply assume that we need to add
> > metrics relevant for queries with joins, metrics that will help us get
> more
> > insights on non-collocated and collocated joins execution flow.
> >
> > > 1) Query exec count
> > > 2) Query exec time (first define what "time" means) - min, max, avg
> >
> > Total query execution time might not be helpful in the trickiest cases.
> > What if you have multiple joins in your query? How do I know which one
> > contributes to the execution most?
> >
> > So, I would do a breakdown having total time, map time, per-join time,
> > reduce time. Hope it’s possible. If it’s unclear how to support
> everything
> > at the first place then it’s a different question. Let’s create several
> > tickets and start implementing everything gracefully.
> >
> > > 3) Number of bytes exchanged between nodes during query execution
> >
> > It will be really helpful to make a breakdown showing a number of bytes
> > exchanged per-join (physical join). Again, if you believe it makes sense
> to
> > do the breakdown later then let’s create an additional ticket for the 2
> > tier metrics then.
> >
> > > 4) Number of returned rows - min, max, avg
> >
> >
> > Plus, let’s add the following to the list:
> >
> > 5) Collocated: yes or no
> >
> > —
> > Denis
> >
> > > On Mar 2, 2017, at 1:32 PM, Vladimir Ozerov <[hidden email]>
> > wrote:
> > >
> > > Denis,
> > >
> > > The main problem with suggested metrics is that they implies that
> ceratin
> > > internal mechanics work in predefined way. For example, what is JOIN
> > > metrics? There are no guarantees that JOIN in user's query will be
> > > translated to a real physical join. What if several different query
> > > execution pieces happen in parallel? What if we rework our distributed
> > > query engine from pull to push approach for performance reasons and
> there
> > > will be no JOINs in classical sense?
> > >
> > > This is why I think that we should start with very basic things.
> > Something
> > > like:
> > > 1) Query exec count
> > > 2) Query exec time (first define what "time" means) - min, max, avg
> > > 3) Number of bytes exchanged between nodes during query execution
> > > 4) Number of returned rows - min, max, avg
> > >
> > > Once we have base numbers in place, we can think of carefully
> integrating
> > > and enhancing all pieces of query execution into more verbose formats,
> > > similar to query plans with relative weights in classical RDBMS
> systems.
> > >
> > > Thoughts?
> > >
> > >
> > >
> > > On Thu, Mar 2, 2017 at 11:31 PM, Denis Magda <[hidden email]>
> wrote:
> > >
> > >> Vovan,
> > >>
> > >> Your metrics make perfect sense to me. However, I see a high demand
> for
> > >> JOINs based metrics especially from those who give a try to
> > non-collocated
> > >> joins in production  and want to measure them somehow. This is why,
> > >> personally, I prefer to see the metrics below in the top priority list
> > as
> > >> well:
> > >>
> > >> if a query was executed in the collocated or non-collocated mode.
> Three
> > >> results are valid: collocated, non-collocated, simple query (no
> joins).
> > >> non-collocated query: size of the data exchanged between the nodes to
> > >> complete a specific join. If there are multiple joins in the query we
> > need
> > >> to provide this metric for every of them.
> > >> non-collocated and collocated query: a part of the time spent joining
> > the
> > >> data. If there are multiple joins in the query we need to provide this
> > >> metric for every of them.
> > >>
> > >> As for “unicast” and “broadcast”, agree, let’s ignore it for now.
> > >>
> > >> In any case, can we include timing information (map phase, reduce
> phase,
> > >> join phase) into an execution plan produced by H2? Are there any
> > >> implementation hooks?
> > >>
> > >> —
> > >> Denis
> > >>
> > >>
> > >>> On Mar 2, 2017, at 12:02 PM, Dmitriy Setrakyan <
> [hidden email]>
> > >> wrote:
> > >>>
> > >>> I think some of the metrics specified by Denis also make sense, so I
> > >> would
> > >>> add them as well. See below...
> > >>>
> > >>> On Thu, Mar 2, 2017 at 12:36 AM, Vladimir Ozerov <
> [hidden email]
> > >> <mailto:[hidden email]>>
> > >>> wrote:
> > >>>
> > >>>> Denis,
> > >>>>
> > >>>> Query execution is complex process involving different stages which
> > are
> > >> not
> > >>>> very easy to match with each other. Especially provided that any
> node
> > >> can
> > >>>> leave topology at any time. Another problem is that engine evolves
> and
> > >>>> metrics like "did a query do broadcast or unicast" may easily become
> > >>>> useless at some point, because for example there will be neither
> > >> unicast,
> > >>>> nor broadast, but something different. On the other hand I
> completely
> > >> agree
> > >>>> that performance monitoring is essential part of any mature DBMS.
> > >>>>
> > >>>> I would start with metrics which are both very basic and easy to
> > >> implement
> > >>>> at the same time. For example we can add fingerprint (hash) to every
> > >> query
> > >>>> which will be used to join "map" and "reduce" parts with each other
> > and
> > >> add
> > >>>> the following basic metrics:
> > >>>> 1) Execution count for particular query
> > >>>> 2) Number of map nodes - min, max, avg
> > >>>>
> > >>>
> > >>> (1) and (2) makes sense
> > >>>
> > >>>
> > >>>> 3) Map step duration (if applicable) - min, max,
> > >>>
> > >>> 4) Reduce step duration (if applicable) - min, max, avg
> > >>>>
> > >>>
> > >>> Not sure if (3) and (4) are needed. I would only add them if they are
> > >> easy
> > >>> to implement.
> > >>>
> > >>> I would also add these:
> > >>>
> > >>> 5) Collocated: yes/no
> > >>> 6) last execution time
> > >>> 7) min/max/average execution duration
> > >>>
> > >>>
> > >>>>
> > >>>> Once done users will be able to get statistics for particular
> queries.
> > >>>>
> > >>>> Vladimir.
> > >>>>
> > >>>>
> > >>>> On Tue, Feb 28, 2017 at 3:12 AM, Denis Magda <[hidden email]>
> > wrote:
> > >>>>
> > >>>>> BTW,
> > >>>>>
> > >>>>> What if we expose per-query metrics below as a part of EXPLAIN
> > ANALYZE?
> > >>>>> Sergi, is this feasible?
> > >>>>>
> > >>>>> —
> > >>>>> Denis
> > >>>>>
> > >>>>>> On Feb 27, 2017, at 2:35 PM, Denis Magda <[hidden email]>
> wrote:
> > >>>>>>
> > >>>>>> Igniters,
> > >>>>>>
> > >>>>>> Let’s shed more light on SQL query execution internals
> introducing a
> > >>>> set
> > >>>>> of useful metrics (https://issues.apache.org/
> jira/browse/IGNITE-4757
> > ).
> > >>>>>>
> > >>>>>> Per-query metrics. Total history size is defined by
> > >>>> *CacheConfiguration.
> > >>>>> getQueryDetailMetricsSize*:
> > >>>>>> * if a query was executed in the collocated or non-collocated
> mode.
> > >>>>> Three results are valid: collocated, non-collocated, simple query
> (no
> > >>>>> joins).
> > >>>>>> * non-collocated query: size of the data exchanged between the
> nodes
> > >> to
> > >>>>> complete a join.
> > >>>>>> * non-collocated query: did a query do broadcast or unicast to get
> > >> data
> > >>>>> needed to complete a join.
> > >>>>>> * non-collocated and collocated query: a part of the time spent
> > >> joining
> > >>>>> the data.
> > >>>>>>
> > >>>>>> CacheMetrics:
> > >>>>>> * an average number of executed SQL queries (collocated,
> > >>>> non-collocated,
> > >>>>> simple query (no joins)).
> > >>>>>>
> > >>>>>> Please don’t hesitate do share suggest another metrics or improve
> > >>>>> proposed ones.
> > >>>>>>
> > >>>>>> —
> > >>>>>> Denis
> > >>
> > >>
> >
> >
>
>
> --
> Sergey Kozlov
> GridGain Systems
> www.gridgain.com
>
Reply | Threaded
Open this post in threaded view
|

Re: Additional SQL metrics

dmagda
Sergey, agree, good point!

Igniters, any other thoughts before we wrap up the discussion updating the ticket content?


Denis

> On Mar 3, 2017, at 10:06 AM, Valentin Kulichenko <[hidden email]> wrote:
>
> Sergey, that's great idea! Generally, user is not interested much in some
> average numbers, especially in case of SQL queries. What they need is a
> list of slow queries and detailed information about the execution flow of
> these particular queries.
>
> -Val
>
> On Fri, Mar 3, 2017 at 2:50 AM, Sergey Kozlov <[hidden email]> wrote:
>
>> One more comment:
>>
>> In general the customer is interested in slow queries details thus we can
>> introduce an option which will allow to store only queries executed more
>> than NNN seconds. It may significantly reduce the the memory consumption
>> for history (but logging of all queries is still available if set that
>> option to 0).
>>
>> On Fri, Mar 3, 2017 at 1:00 AM, Denis Magda <[hidden email]> wrote:
>>
>>> Vovan,
>>>
>>> When I’m speaking of JOIN metrics I’m simply assume that we need to add
>>> metrics relevant for queries with joins, metrics that will help us get
>> more
>>> insights on non-collocated and collocated joins execution flow.
>>>
>>>> 1) Query exec count
>>>> 2) Query exec time (first define what "time" means) - min, max, avg
>>>
>>> Total query execution time might not be helpful in the trickiest cases.
>>> What if you have multiple joins in your query? How do I know which one
>>> contributes to the execution most?
>>>
>>> So, I would do a breakdown having total time, map time, per-join time,
>>> reduce time. Hope it’s possible. If it’s unclear how to support
>> everything
>>> at the first place then it’s a different question. Let’s create several
>>> tickets and start implementing everything gracefully.
>>>
>>>> 3) Number of bytes exchanged between nodes during query execution
>>>
>>> It will be really helpful to make a breakdown showing a number of bytes
>>> exchanged per-join (physical join). Again, if you believe it makes sense
>> to
>>> do the breakdown later then let’s create an additional ticket for the 2
>>> tier metrics then.
>>>
>>>> 4) Number of returned rows - min, max, avg
>>>
>>>
>>> Plus, let’s add the following to the list:
>>>
>>> 5) Collocated: yes or no
>>>
>>> —
>>> Denis
>>>
>>>> On Mar 2, 2017, at 1:32 PM, Vladimir Ozerov <[hidden email]>
>>> wrote:
>>>>
>>>> Denis,
>>>>
>>>> The main problem with suggested metrics is that they implies that
>> ceratin
>>>> internal mechanics work in predefined way. For example, what is JOIN
>>>> metrics? There are no guarantees that JOIN in user's query will be
>>>> translated to a real physical join. What if several different query
>>>> execution pieces happen in parallel? What if we rework our distributed
>>>> query engine from pull to push approach for performance reasons and
>> there
>>>> will be no JOINs in classical sense?
>>>>
>>>> This is why I think that we should start with very basic things.
>>> Something
>>>> like:
>>>> 1) Query exec count
>>>> 2) Query exec time (first define what "time" means) - min, max, avg
>>>> 3) Number of bytes exchanged between nodes during query execution
>>>> 4) Number of returned rows - min, max, avg
>>>>
>>>> Once we have base numbers in place, we can think of carefully
>> integrating
>>>> and enhancing all pieces of query execution into more verbose formats,
>>>> similar to query plans with relative weights in classical RDBMS
>> systems.
>>>>
>>>> Thoughts?
>>>>
>>>>
>>>>
>>>> On Thu, Mar 2, 2017 at 11:31 PM, Denis Magda <[hidden email]>
>> wrote:
>>>>
>>>>> Vovan,
>>>>>
>>>>> Your metrics make perfect sense to me. However, I see a high demand
>> for
>>>>> JOINs based metrics especially from those who give a try to
>>> non-collocated
>>>>> joins in production  and want to measure them somehow. This is why,
>>>>> personally, I prefer to see the metrics below in the top priority list
>>> as
>>>>> well:
>>>>>
>>>>> if a query was executed in the collocated or non-collocated mode.
>> Three
>>>>> results are valid: collocated, non-collocated, simple query (no
>> joins).
>>>>> non-collocated query: size of the data exchanged between the nodes to
>>>>> complete a specific join. If there are multiple joins in the query we
>>> need
>>>>> to provide this metric for every of them.
>>>>> non-collocated and collocated query: a part of the time spent joining
>>> the
>>>>> data. If there are multiple joins in the query we need to provide this
>>>>> metric for every of them.
>>>>>
>>>>> As for “unicast” and “broadcast”, agree, let’s ignore it for now.
>>>>>
>>>>> In any case, can we include timing information (map phase, reduce
>> phase,
>>>>> join phase) into an execution plan produced by H2? Are there any
>>>>> implementation hooks?
>>>>>
>>>>> —
>>>>> Denis
>>>>>
>>>>>
>>>>>> On Mar 2, 2017, at 12:02 PM, Dmitriy Setrakyan <
>> [hidden email]>
>>>>> wrote:
>>>>>>
>>>>>> I think some of the metrics specified by Denis also make sense, so I
>>>>> would
>>>>>> add them as well. See below...
>>>>>>
>>>>>> On Thu, Mar 2, 2017 at 12:36 AM, Vladimir Ozerov <
>> [hidden email]
>>>>> <mailto:[hidden email]>>
>>>>>> wrote:
>>>>>>
>>>>>>> Denis,
>>>>>>>
>>>>>>> Query execution is complex process involving different stages which
>>> are
>>>>> not
>>>>>>> very easy to match with each other. Especially provided that any
>> node
>>>>> can
>>>>>>> leave topology at any time. Another problem is that engine evolves
>> and
>>>>>>> metrics like "did a query do broadcast or unicast" may easily become
>>>>>>> useless at some point, because for example there will be neither
>>>>> unicast,
>>>>>>> nor broadast, but something different. On the other hand I
>> completely
>>>>> agree
>>>>>>> that performance monitoring is essential part of any mature DBMS.
>>>>>>>
>>>>>>> I would start with metrics which are both very basic and easy to
>>>>> implement
>>>>>>> at the same time. For example we can add fingerprint (hash) to every
>>>>> query
>>>>>>> which will be used to join "map" and "reduce" parts with each other
>>> and
>>>>> add
>>>>>>> the following basic metrics:
>>>>>>> 1) Execution count for particular query
>>>>>>> 2) Number of map nodes - min, max, avg
>>>>>>>
>>>>>>
>>>>>> (1) and (2) makes sense
>>>>>>
>>>>>>
>>>>>>> 3) Map step duration (if applicable) - min, max,
>>>>>>
>>>>>> 4) Reduce step duration (if applicable) - min, max, avg
>>>>>>>
>>>>>>
>>>>>> Not sure if (3) and (4) are needed. I would only add them if they are
>>>>> easy
>>>>>> to implement.
>>>>>>
>>>>>> I would also add these:
>>>>>>
>>>>>> 5) Collocated: yes/no
>>>>>> 6) last execution time
>>>>>> 7) min/max/average execution duration
>>>>>>
>>>>>>
>>>>>>>
>>>>>>> Once done users will be able to get statistics for particular
>> queries.
>>>>>>>
>>>>>>> Vladimir.
>>>>>>>
>>>>>>>
>>>>>>> On Tue, Feb 28, 2017 at 3:12 AM, Denis Magda <[hidden email]>
>>> wrote:
>>>>>>>
>>>>>>>> BTW,
>>>>>>>>
>>>>>>>> What if we expose per-query metrics below as a part of EXPLAIN
>>> ANALYZE?
>>>>>>>> Sergi, is this feasible?
>>>>>>>>
>>>>>>>> —
>>>>>>>> Denis
>>>>>>>>
>>>>>>>>> On Feb 27, 2017, at 2:35 PM, Denis Magda <[hidden email]>
>> wrote:
>>>>>>>>>
>>>>>>>>> Igniters,
>>>>>>>>>
>>>>>>>>> Let’s shed more light on SQL query execution internals
>> introducing a
>>>>>>> set
>>>>>>>> of useful metrics (https://issues.apache.org/
>> jira/browse/IGNITE-4757
>>> ).
>>>>>>>>>
>>>>>>>>> Per-query metrics. Total history size is defined by
>>>>>>> *CacheConfiguration.
>>>>>>>> getQueryDetailMetricsSize*:
>>>>>>>>> * if a query was executed in the collocated or non-collocated
>> mode.
>>>>>>>> Three results are valid: collocated, non-collocated, simple query
>> (no
>>>>>>>> joins).
>>>>>>>>> * non-collocated query: size of the data exchanged between the
>> nodes
>>>>> to
>>>>>>>> complete a join.
>>>>>>>>> * non-collocated query: did a query do broadcast or unicast to get
>>>>> data
>>>>>>>> needed to complete a join.
>>>>>>>>> * non-collocated and collocated query: a part of the time spent
>>>>> joining
>>>>>>>> the data.
>>>>>>>>>
>>>>>>>>> CacheMetrics:
>>>>>>>>> * an average number of executed SQL queries (collocated,
>>>>>>> non-collocated,
>>>>>>>> simple query (no joins)).
>>>>>>>>>
>>>>>>>>> Please don’t hesitate do share suggest another metrics or improve
>>>>>>>> proposed ones.
>>>>>>>>>
>>>>>>>>> —
>>>>>>>>> Denis
>>>>>
>>>>>
>>>
>>>
>>
>>
>> --
>> Sergey Kozlov
>> GridGain Systems
>> www.gridgain.com
>>

Reply | Threaded
Open this post in threaded view
|

Re: Additional SQL metrics

dsetrakyan
Hm... as a user I would be interested to know that, say, 95% of my "select
* from sometable where..." query executes under 10ms or so.

I think holding some history is important and is not that hard to implement.

D.

On Fri, Mar 3, 2017 at 10:55 AM, Denis Magda <[hidden email]> wrote:

> Sergey, agree, good point!
>
> Igniters, any other thoughts before we wrap up the discussion updating the
> ticket content?
>
> —
> Denis
>
> > On Mar 3, 2017, at 10:06 AM, Valentin Kulichenko <
> [hidden email]> wrote:
> >
> > Sergey, that's great idea! Generally, user is not interested much in some
> > average numbers, especially in case of SQL queries. What they need is a
> > list of slow queries and detailed information about the execution flow of
> > these particular queries.
> >
> > -Val
> >
> > On Fri, Mar 3, 2017 at 2:50 AM, Sergey Kozlov <[hidden email]>
> wrote:
> >
> >> One more comment:
> >>
> >> In general the customer is interested in slow queries details thus we
> can
> >> introduce an option which will allow to store only queries executed more
> >> than NNN seconds. It may significantly reduce the the memory consumption
> >> for history (but logging of all queries is still available if set that
> >> option to 0).
> >>
> >> On Fri, Mar 3, 2017 at 1:00 AM, Denis Magda <[hidden email]> wrote:
> >>
> >>> Vovan,
> >>>
> >>> When I’m speaking of JOIN metrics I’m simply assume that we need to add
> >>> metrics relevant for queries with joins, metrics that will help us get
> >> more
> >>> insights on non-collocated and collocated joins execution flow.
> >>>
> >>>> 1) Query exec count
> >>>> 2) Query exec time (first define what "time" means) - min, max, avg
> >>>
> >>> Total query execution time might not be helpful in the trickiest cases.
> >>> What if you have multiple joins in your query? How do I know which one
> >>> contributes to the execution most?
> >>>
> >>> So, I would do a breakdown having total time, map time, per-join time,
> >>> reduce time. Hope it’s possible. If it’s unclear how to support
> >> everything
> >>> at the first place then it’s a different question. Let’s create several
> >>> tickets and start implementing everything gracefully.
> >>>
> >>>> 3) Number of bytes exchanged between nodes during query execution
> >>>
> >>> It will be really helpful to make a breakdown showing a number of bytes
> >>> exchanged per-join (physical join). Again, if you believe it makes
> sense
> >> to
> >>> do the breakdown later then let’s create an additional ticket for the 2
> >>> tier metrics then.
> >>>
> >>>> 4) Number of returned rows - min, max, avg
> >>>
> >>>
> >>> Plus, let’s add the following to the list:
> >>>
> >>> 5) Collocated: yes or no
> >>>
> >>> —
> >>> Denis
> >>>
> >>>> On Mar 2, 2017, at 1:32 PM, Vladimir Ozerov <[hidden email]>
> >>> wrote:
> >>>>
> >>>> Denis,
> >>>>
> >>>> The main problem with suggested metrics is that they implies that
> >> ceratin
> >>>> internal mechanics work in predefined way. For example, what is JOIN
> >>>> metrics? There are no guarantees that JOIN in user's query will be
> >>>> translated to a real physical join. What if several different query
> >>>> execution pieces happen in parallel? What if we rework our distributed
> >>>> query engine from pull to push approach for performance reasons and
> >> there
> >>>> will be no JOINs in classical sense?
> >>>>
> >>>> This is why I think that we should start with very basic things.
> >>> Something
> >>>> like:
> >>>> 1) Query exec count
> >>>> 2) Query exec time (first define what "time" means) - min, max, avg
> >>>> 3) Number of bytes exchanged between nodes during query execution
> >>>> 4) Number of returned rows - min, max, avg
> >>>>
> >>>> Once we have base numbers in place, we can think of carefully
> >> integrating
> >>>> and enhancing all pieces of query execution into more verbose formats,
> >>>> similar to query plans with relative weights in classical RDBMS
> >> systems.
> >>>>
> >>>> Thoughts?
> >>>>
> >>>>
> >>>>
> >>>> On Thu, Mar 2, 2017 at 11:31 PM, Denis Magda <[hidden email]>
> >> wrote:
> >>>>
> >>>>> Vovan,
> >>>>>
> >>>>> Your metrics make perfect sense to me. However, I see a high demand
> >> for
> >>>>> JOINs based metrics especially from those who give a try to
> >>> non-collocated
> >>>>> joins in production  and want to measure them somehow. This is why,
> >>>>> personally, I prefer to see the metrics below in the top priority
> list
> >>> as
> >>>>> well:
> >>>>>
> >>>>> if a query was executed in the collocated or non-collocated mode.
> >> Three
> >>>>> results are valid: collocated, non-collocated, simple query (no
> >> joins).
> >>>>> non-collocated query: size of the data exchanged between the nodes to
> >>>>> complete a specific join. If there are multiple joins in the query we
> >>> need
> >>>>> to provide this metric for every of them.
> >>>>> non-collocated and collocated query: a part of the time spent joining
> >>> the
> >>>>> data. If there are multiple joins in the query we need to provide
> this
> >>>>> metric for every of them.
> >>>>>
> >>>>> As for “unicast” and “broadcast”, agree, let’s ignore it for now.
> >>>>>
> >>>>> In any case, can we include timing information (map phase, reduce
> >> phase,
> >>>>> join phase) into an execution plan produced by H2? Are there any
> >>>>> implementation hooks?
> >>>>>
> >>>>> —
> >>>>> Denis
> >>>>>
> >>>>>
> >>>>>> On Mar 2, 2017, at 12:02 PM, Dmitriy Setrakyan <
> >> [hidden email]>
> >>>>> wrote:
> >>>>>>
> >>>>>> I think some of the metrics specified by Denis also make sense, so I
> >>>>> would
> >>>>>> add them as well. See below...
> >>>>>>
> >>>>>> On Thu, Mar 2, 2017 at 12:36 AM, Vladimir Ozerov <
> >> [hidden email]
> >>>>> <mailto:[hidden email]>>
> >>>>>> wrote:
> >>>>>>
> >>>>>>> Denis,
> >>>>>>>
> >>>>>>> Query execution is complex process involving different stages which
> >>> are
> >>>>> not
> >>>>>>> very easy to match with each other. Especially provided that any
> >> node
> >>>>> can
> >>>>>>> leave topology at any time. Another problem is that engine evolves
> >> and
> >>>>>>> metrics like "did a query do broadcast or unicast" may easily
> become
> >>>>>>> useless at some point, because for example there will be neither
> >>>>> unicast,
> >>>>>>> nor broadast, but something different. On the other hand I
> >> completely
> >>>>> agree
> >>>>>>> that performance monitoring is essential part of any mature DBMS.
> >>>>>>>
> >>>>>>> I would start with metrics which are both very basic and easy to
> >>>>> implement
> >>>>>>> at the same time. For example we can add fingerprint (hash) to
> every
> >>>>> query
> >>>>>>> which will be used to join "map" and "reduce" parts with each other
> >>> and
> >>>>> add
> >>>>>>> the following basic metrics:
> >>>>>>> 1) Execution count for particular query
> >>>>>>> 2) Number of map nodes - min, max, avg
> >>>>>>>
> >>>>>>
> >>>>>> (1) and (2) makes sense
> >>>>>>
> >>>>>>
> >>>>>>> 3) Map step duration (if applicable) - min, max,
> >>>>>>
> >>>>>> 4) Reduce step duration (if applicable) - min, max, avg
> >>>>>>>
> >>>>>>
> >>>>>> Not sure if (3) and (4) are needed. I would only add them if they
> are
> >>>>> easy
> >>>>>> to implement.
> >>>>>>
> >>>>>> I would also add these:
> >>>>>>
> >>>>>> 5) Collocated: yes/no
> >>>>>> 6) last execution time
> >>>>>> 7) min/max/average execution duration
> >>>>>>
> >>>>>>
> >>>>>>>
> >>>>>>> Once done users will be able to get statistics for particular
> >> queries.
> >>>>>>>
> >>>>>>> Vladimir.
> >>>>>>>
> >>>>>>>
> >>>>>>> On Tue, Feb 28, 2017 at 3:12 AM, Denis Magda <[hidden email]>
> >>> wrote:
> >>>>>>>
> >>>>>>>> BTW,
> >>>>>>>>
> >>>>>>>> What if we expose per-query metrics below as a part of EXPLAIN
> >>> ANALYZE?
> >>>>>>>> Sergi, is this feasible?
> >>>>>>>>
> >>>>>>>> —
> >>>>>>>> Denis
> >>>>>>>>
> >>>>>>>>> On Feb 27, 2017, at 2:35 PM, Denis Magda <[hidden email]>
> >> wrote:
> >>>>>>>>>
> >>>>>>>>> Igniters,
> >>>>>>>>>
> >>>>>>>>> Let’s shed more light on SQL query execution internals
> >> introducing a
> >>>>>>> set
> >>>>>>>> of useful metrics (https://issues.apache.org/
> >> jira/browse/IGNITE-4757
> >>> ).
> >>>>>>>>>
> >>>>>>>>> Per-query metrics. Total history size is defined by
> >>>>>>> *CacheConfiguration.
> >>>>>>>> getQueryDetailMetricsSize*:
> >>>>>>>>> * if a query was executed in the collocated or non-collocated
> >> mode.
> >>>>>>>> Three results are valid: collocated, non-collocated, simple query
> >> (no
> >>>>>>>> joins).
> >>>>>>>>> * non-collocated query: size of the data exchanged between the
> >> nodes
> >>>>> to
> >>>>>>>> complete a join.
> >>>>>>>>> * non-collocated query: did a query do broadcast or unicast to
> get
> >>>>> data
> >>>>>>>> needed to complete a join.
> >>>>>>>>> * non-collocated and collocated query: a part of the time spent
> >>>>> joining
> >>>>>>>> the data.
> >>>>>>>>>
> >>>>>>>>> CacheMetrics:
> >>>>>>>>> * an average number of executed SQL queries (collocated,
> >>>>>>> non-collocated,
> >>>>>>>> simple query (no joins)).
> >>>>>>>>>
> >>>>>>>>> Please don’t hesitate do share suggest another metrics or improve
> >>>>>>>> proposed ones.
> >>>>>>>>>
> >>>>>>>>> —
> >>>>>>>>> Denis
> >>>>>
> >>>>>
> >>>
> >>>
> >>
> >>
> >> --
> >> Sergey Kozlov
> >> GridGain Systems
> >> www.gridgain.com
> >>
>
>
Reply | Threaded
Open this post in threaded view
|

Re: Additional SQL metrics

dmagda
What you’re saying should be default behavior. Plus, we can add a special parameter that will gather metrics for queries executed longer that N.


Denis

> On Mar 3, 2017, at 11:01 AM, Dmitriy Setrakyan <[hidden email]> wrote:
>
> Hm... as a user I would be interested to know that, say, 95% of my "select
> * from sometable where..." query executes under 10ms or so.
>
> I think holding some history is important and is not that hard to implement.
>
> D.
>
> On Fri, Mar 3, 2017 at 10:55 AM, Denis Magda <[hidden email]> wrote:
>
>> Sergey, agree, good point!
>>
>> Igniters, any other thoughts before we wrap up the discussion updating the
>> ticket content?
>>
>> —
>> Denis
>>
>>> On Mar 3, 2017, at 10:06 AM, Valentin Kulichenko <
>> [hidden email]> wrote:
>>>
>>> Sergey, that's great idea! Generally, user is not interested much in some
>>> average numbers, especially in case of SQL queries. What they need is a
>>> list of slow queries and detailed information about the execution flow of
>>> these particular queries.
>>>
>>> -Val
>>>
>>> On Fri, Mar 3, 2017 at 2:50 AM, Sergey Kozlov <[hidden email]>
>> wrote:
>>>
>>>> One more comment:
>>>>
>>>> In general the customer is interested in slow queries details thus we
>> can
>>>> introduce an option which will allow to store only queries executed more
>>>> than NNN seconds. It may significantly reduce the the memory consumption
>>>> for history (but logging of all queries is still available if set that
>>>> option to 0).
>>>>
>>>> On Fri, Mar 3, 2017 at 1:00 AM, Denis Magda <[hidden email]> wrote:
>>>>
>>>>> Vovan,
>>>>>
>>>>> When I’m speaking of JOIN metrics I’m simply assume that we need to add
>>>>> metrics relevant for queries with joins, metrics that will help us get
>>>> more
>>>>> insights on non-collocated and collocated joins execution flow.
>>>>>
>>>>>> 1) Query exec count
>>>>>> 2) Query exec time (first define what "time" means) - min, max, avg
>>>>>
>>>>> Total query execution time might not be helpful in the trickiest cases.
>>>>> What if you have multiple joins in your query? How do I know which one
>>>>> contributes to the execution most?
>>>>>
>>>>> So, I would do a breakdown having total time, map time, per-join time,
>>>>> reduce time. Hope it’s possible. If it’s unclear how to support
>>>> everything
>>>>> at the first place then it’s a different question. Let’s create several
>>>>> tickets and start implementing everything gracefully.
>>>>>
>>>>>> 3) Number of bytes exchanged between nodes during query execution
>>>>>
>>>>> It will be really helpful to make a breakdown showing a number of bytes
>>>>> exchanged per-join (physical join). Again, if you believe it makes
>> sense
>>>> to
>>>>> do the breakdown later then let’s create an additional ticket for the 2
>>>>> tier metrics then.
>>>>>
>>>>>> 4) Number of returned rows - min, max, avg
>>>>>
>>>>>
>>>>> Plus, let’s add the following to the list:
>>>>>
>>>>> 5) Collocated: yes or no
>>>>>
>>>>> —
>>>>> Denis
>>>>>
>>>>>> On Mar 2, 2017, at 1:32 PM, Vladimir Ozerov <[hidden email]>
>>>>> wrote:
>>>>>>
>>>>>> Denis,
>>>>>>
>>>>>> The main problem with suggested metrics is that they implies that
>>>> ceratin
>>>>>> internal mechanics work in predefined way. For example, what is JOIN
>>>>>> metrics? There are no guarantees that JOIN in user's query will be
>>>>>> translated to a real physical join. What if several different query
>>>>>> execution pieces happen in parallel? What if we rework our distributed
>>>>>> query engine from pull to push approach for performance reasons and
>>>> there
>>>>>> will be no JOINs in classical sense?
>>>>>>
>>>>>> This is why I think that we should start with very basic things.
>>>>> Something
>>>>>> like:
>>>>>> 1) Query exec count
>>>>>> 2) Query exec time (first define what "time" means) - min, max, avg
>>>>>> 3) Number of bytes exchanged between nodes during query execution
>>>>>> 4) Number of returned rows - min, max, avg
>>>>>>
>>>>>> Once we have base numbers in place, we can think of carefully
>>>> integrating
>>>>>> and enhancing all pieces of query execution into more verbose formats,
>>>>>> similar to query plans with relative weights in classical RDBMS
>>>> systems.
>>>>>>
>>>>>> Thoughts?
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Thu, Mar 2, 2017 at 11:31 PM, Denis Magda <[hidden email]>
>>>> wrote:
>>>>>>
>>>>>>> Vovan,
>>>>>>>
>>>>>>> Your metrics make perfect sense to me. However, I see a high demand
>>>> for
>>>>>>> JOINs based metrics especially from those who give a try to
>>>>> non-collocated
>>>>>>> joins in production  and want to measure them somehow. This is why,
>>>>>>> personally, I prefer to see the metrics below in the top priority
>> list
>>>>> as
>>>>>>> well:
>>>>>>>
>>>>>>> if a query was executed in the collocated or non-collocated mode.
>>>> Three
>>>>>>> results are valid: collocated, non-collocated, simple query (no
>>>> joins).
>>>>>>> non-collocated query: size of the data exchanged between the nodes to
>>>>>>> complete a specific join. If there are multiple joins in the query we
>>>>> need
>>>>>>> to provide this metric for every of them.
>>>>>>> non-collocated and collocated query: a part of the time spent joining
>>>>> the
>>>>>>> data. If there are multiple joins in the query we need to provide
>> this
>>>>>>> metric for every of them.
>>>>>>>
>>>>>>> As for “unicast” and “broadcast”, agree, let’s ignore it for now.
>>>>>>>
>>>>>>> In any case, can we include timing information (map phase, reduce
>>>> phase,
>>>>>>> join phase) into an execution plan produced by H2? Are there any
>>>>>>> implementation hooks?
>>>>>>>
>>>>>>> —
>>>>>>> Denis
>>>>>>>
>>>>>>>
>>>>>>>> On Mar 2, 2017, at 12:02 PM, Dmitriy Setrakyan <
>>>> [hidden email]>
>>>>>>> wrote:
>>>>>>>>
>>>>>>>> I think some of the metrics specified by Denis also make sense, so I
>>>>>>> would
>>>>>>>> add them as well. See below...
>>>>>>>>
>>>>>>>> On Thu, Mar 2, 2017 at 12:36 AM, Vladimir Ozerov <
>>>> [hidden email]
>>>>>>> <mailto:[hidden email]>>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Denis,
>>>>>>>>>
>>>>>>>>> Query execution is complex process involving different stages which
>>>>> are
>>>>>>> not
>>>>>>>>> very easy to match with each other. Especially provided that any
>>>> node
>>>>>>> can
>>>>>>>>> leave topology at any time. Another problem is that engine evolves
>>>> and
>>>>>>>>> metrics like "did a query do broadcast or unicast" may easily
>> become
>>>>>>>>> useless at some point, because for example there will be neither
>>>>>>> unicast,
>>>>>>>>> nor broadast, but something different. On the other hand I
>>>> completely
>>>>>>> agree
>>>>>>>>> that performance monitoring is essential part of any mature DBMS.
>>>>>>>>>
>>>>>>>>> I would start with metrics which are both very basic and easy to
>>>>>>> implement
>>>>>>>>> at the same time. For example we can add fingerprint (hash) to
>> every
>>>>>>> query
>>>>>>>>> which will be used to join "map" and "reduce" parts with each other
>>>>> and
>>>>>>> add
>>>>>>>>> the following basic metrics:
>>>>>>>>> 1) Execution count for particular query
>>>>>>>>> 2) Number of map nodes - min, max, avg
>>>>>>>>>
>>>>>>>>
>>>>>>>> (1) and (2) makes sense
>>>>>>>>
>>>>>>>>
>>>>>>>>> 3) Map step duration (if applicable) - min, max,
>>>>>>>>
>>>>>>>> 4) Reduce step duration (if applicable) - min, max, avg
>>>>>>>>>
>>>>>>>>
>>>>>>>> Not sure if (3) and (4) are needed. I would only add them if they
>> are
>>>>>>> easy
>>>>>>>> to implement.
>>>>>>>>
>>>>>>>> I would also add these:
>>>>>>>>
>>>>>>>> 5) Collocated: yes/no
>>>>>>>> 6) last execution time
>>>>>>>> 7) min/max/average execution duration
>>>>>>>>
>>>>>>>>
>>>>>>>>>
>>>>>>>>> Once done users will be able to get statistics for particular
>>>> queries.
>>>>>>>>>
>>>>>>>>> Vladimir.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Tue, Feb 28, 2017 at 3:12 AM, Denis Magda <[hidden email]>
>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> BTW,
>>>>>>>>>>
>>>>>>>>>> What if we expose per-query metrics below as a part of EXPLAIN
>>>>> ANALYZE?
>>>>>>>>>> Sergi, is this feasible?
>>>>>>>>>>
>>>>>>>>>> —
>>>>>>>>>> Denis
>>>>>>>>>>
>>>>>>>>>>> On Feb 27, 2017, at 2:35 PM, Denis Magda <[hidden email]>
>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>> Igniters,
>>>>>>>>>>>
>>>>>>>>>>> Let’s shed more light on SQL query execution internals
>>>> introducing a
>>>>>>>>> set
>>>>>>>>>> of useful metrics (https://issues.apache.org/
>>>> jira/browse/IGNITE-4757
>>>>> ).
>>>>>>>>>>>
>>>>>>>>>>> Per-query metrics. Total history size is defined by
>>>>>>>>> *CacheConfiguration.
>>>>>>>>>> getQueryDetailMetricsSize*:
>>>>>>>>>>> * if a query was executed in the collocated or non-collocated
>>>> mode.
>>>>>>>>>> Three results are valid: collocated, non-collocated, simple query
>>>> (no
>>>>>>>>>> joins).
>>>>>>>>>>> * non-collocated query: size of the data exchanged between the
>>>> nodes
>>>>>>> to
>>>>>>>>>> complete a join.
>>>>>>>>>>> * non-collocated query: did a query do broadcast or unicast to
>> get
>>>>>>> data
>>>>>>>>>> needed to complete a join.
>>>>>>>>>>> * non-collocated and collocated query: a part of the time spent
>>>>>>> joining
>>>>>>>>>> the data.
>>>>>>>>>>>
>>>>>>>>>>> CacheMetrics:
>>>>>>>>>>> * an average number of executed SQL queries (collocated,
>>>>>>>>> non-collocated,
>>>>>>>>>> simple query (no joins)).
>>>>>>>>>>>
>>>>>>>>>>> Please don’t hesitate do share suggest another metrics or improve
>>>>>>>>>> proposed ones.
>>>>>>>>>>>
>>>>>>>>>>> —
>>>>>>>>>>> Denis
>>>>>>>
>>>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>> --
>>>> Sergey Kozlov
>>>> GridGain Systems
>>>> www.gridgain.com
>>>>
>>
>>

Reply | Threaded
Open this post in threaded view
|

Re: Additional SQL metrics

dsetrakyan
On Fri, Mar 3, 2017 at 11:07 AM, Denis Magda <[hidden email]> wrote:

> What you’re saying should be default behavior. Plus, we can add a special
> parameter that will gather metrics for queries executed longer that N.
>

Agree. I thought we already have that in some form. I remember seeing a
warning for long queries in the log.


>
> —
> Denis
>
> > On Mar 3, 2017, at 11:01 AM, Dmitriy Setrakyan <[hidden email]>
> wrote:
> >
> > Hm... as a user I would be interested to know that, say, 95% of my
> "select
> > * from sometable where..." query executes under 10ms or so.
> >
> > I think holding some history is important and is not that hard to
> implement.
> >
> > D.
> >
> > On Fri, Mar 3, 2017 at 10:55 AM, Denis Magda <[hidden email]> wrote:
> >
> >> Sergey, agree, good point!
> >>
> >> Igniters, any other thoughts before we wrap up the discussion updating
> the
> >> ticket content?
> >>
> >> —
> >> Denis
> >>
> >>> On Mar 3, 2017, at 10:06 AM, Valentin Kulichenko <
> >> [hidden email]> wrote:
> >>>
> >>> Sergey, that's great idea! Generally, user is not interested much in
> some
> >>> average numbers, especially in case of SQL queries. What they need is a
> >>> list of slow queries and detailed information about the execution flow
> of
> >>> these particular queries.
> >>>
> >>> -Val
> >>>
> >>> On Fri, Mar 3, 2017 at 2:50 AM, Sergey Kozlov <[hidden email]>
> >> wrote:
> >>>
> >>>> One more comment:
> >>>>
> >>>> In general the customer is interested in slow queries details thus we
> >> can
> >>>> introduce an option which will allow to store only queries executed
> more
> >>>> than NNN seconds. It may significantly reduce the the memory
> consumption
> >>>> for history (but logging of all queries is still available if set that
> >>>> option to 0).
> >>>>
> >>>> On Fri, Mar 3, 2017 at 1:00 AM, Denis Magda <[hidden email]>
> wrote:
> >>>>
> >>>>> Vovan,
> >>>>>
> >>>>> When I’m speaking of JOIN metrics I’m simply assume that we need to
> add
> >>>>> metrics relevant for queries with joins, metrics that will help us
> get
> >>>> more
> >>>>> insights on non-collocated and collocated joins execution flow.
> >>>>>
> >>>>>> 1) Query exec count
> >>>>>> 2) Query exec time (first define what "time" means) - min, max, avg
> >>>>>
> >>>>> Total query execution time might not be helpful in the trickiest
> cases.
> >>>>> What if you have multiple joins in your query? How do I know which
> one
> >>>>> contributes to the execution most?
> >>>>>
> >>>>> So, I would do a breakdown having total time, map time, per-join
> time,
> >>>>> reduce time. Hope it’s possible. If it’s unclear how to support
> >>>> everything
> >>>>> at the first place then it’s a different question. Let’s create
> several
> >>>>> tickets and start implementing everything gracefully.
> >>>>>
> >>>>>> 3) Number of bytes exchanged between nodes during query execution
> >>>>>
> >>>>> It will be really helpful to make a breakdown showing a number of
> bytes
> >>>>> exchanged per-join (physical join). Again, if you believe it makes
> >> sense
> >>>> to
> >>>>> do the breakdown later then let’s create an additional ticket for
> the 2
> >>>>> tier metrics then.
> >>>>>
> >>>>>> 4) Number of returned rows - min, max, avg
> >>>>>
> >>>>>
> >>>>> Plus, let’s add the following to the list:
> >>>>>
> >>>>> 5) Collocated: yes or no
> >>>>>
> >>>>> —
> >>>>> Denis
> >>>>>
> >>>>>> On Mar 2, 2017, at 1:32 PM, Vladimir Ozerov <[hidden email]>
> >>>>> wrote:
> >>>>>>
> >>>>>> Denis,
> >>>>>>
> >>>>>> The main problem with suggested metrics is that they implies that
> >>>> ceratin
> >>>>>> internal mechanics work in predefined way. For example, what is JOIN
> >>>>>> metrics? There are no guarantees that JOIN in user's query will be
> >>>>>> translated to a real physical join. What if several different query
> >>>>>> execution pieces happen in parallel? What if we rework our
> distributed
> >>>>>> query engine from pull to push approach for performance reasons and
> >>>> there
> >>>>>> will be no JOINs in classical sense?
> >>>>>>
> >>>>>> This is why I think that we should start with very basic things.
> >>>>> Something
> >>>>>> like:
> >>>>>> 1) Query exec count
> >>>>>> 2) Query exec time (first define what "time" means) - min, max, avg
> >>>>>> 3) Number of bytes exchanged between nodes during query execution
> >>>>>> 4) Number of returned rows - min, max, avg
> >>>>>>
> >>>>>> Once we have base numbers in place, we can think of carefully
> >>>> integrating
> >>>>>> and enhancing all pieces of query execution into more verbose
> formats,
> >>>>>> similar to query plans with relative weights in classical RDBMS
> >>>> systems.
> >>>>>>
> >>>>>> Thoughts?
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> On Thu, Mar 2, 2017 at 11:31 PM, Denis Magda <[hidden email]>
> >>>> wrote:
> >>>>>>
> >>>>>>> Vovan,
> >>>>>>>
> >>>>>>> Your metrics make perfect sense to me. However, I see a high demand
> >>>> for
> >>>>>>> JOINs based metrics especially from those who give a try to
> >>>>> non-collocated
> >>>>>>> joins in production  and want to measure them somehow. This is why,
> >>>>>>> personally, I prefer to see the metrics below in the top priority
> >> list
> >>>>> as
> >>>>>>> well:
> >>>>>>>
> >>>>>>> if a query was executed in the collocated or non-collocated mode.
> >>>> Three
> >>>>>>> results are valid: collocated, non-collocated, simple query (no
> >>>> joins).
> >>>>>>> non-collocated query: size of the data exchanged between the nodes
> to
> >>>>>>> complete a specific join. If there are multiple joins in the query
> we
> >>>>> need
> >>>>>>> to provide this metric for every of them.
> >>>>>>> non-collocated and collocated query: a part of the time spent
> joining
> >>>>> the
> >>>>>>> data. If there are multiple joins in the query we need to provide
> >> this
> >>>>>>> metric for every of them.
> >>>>>>>
> >>>>>>> As for “unicast” and “broadcast”, agree, let’s ignore it for now.
> >>>>>>>
> >>>>>>> In any case, can we include timing information (map phase, reduce
> >>>> phase,
> >>>>>>> join phase) into an execution plan produced by H2? Are there any
> >>>>>>> implementation hooks?
> >>>>>>>
> >>>>>>> —
> >>>>>>> Denis
> >>>>>>>
> >>>>>>>
> >>>>>>>> On Mar 2, 2017, at 12:02 PM, Dmitriy Setrakyan <
> >>>> [hidden email]>
> >>>>>>> wrote:
> >>>>>>>>
> >>>>>>>> I think some of the metrics specified by Denis also make sense,
> so I
> >>>>>>> would
> >>>>>>>> add them as well. See below...
> >>>>>>>>
> >>>>>>>> On Thu, Mar 2, 2017 at 12:36 AM, Vladimir Ozerov <
> >>>> [hidden email]
> >>>>>>> <mailto:[hidden email]>>
> >>>>>>>> wrote:
> >>>>>>>>
> >>>>>>>>> Denis,
> >>>>>>>>>
> >>>>>>>>> Query execution is complex process involving different stages
> which
> >>>>> are
> >>>>>>> not
> >>>>>>>>> very easy to match with each other. Especially provided that any
> >>>> node
> >>>>>>> can
> >>>>>>>>> leave topology at any time. Another problem is that engine
> evolves
> >>>> and
> >>>>>>>>> metrics like "did a query do broadcast or unicast" may easily
> >> become
> >>>>>>>>> useless at some point, because for example there will be neither
> >>>>>>> unicast,
> >>>>>>>>> nor broadast, but something different. On the other hand I
> >>>> completely
> >>>>>>> agree
> >>>>>>>>> that performance monitoring is essential part of any mature DBMS.
> >>>>>>>>>
> >>>>>>>>> I would start with metrics which are both very basic and easy to
> >>>>>>> implement
> >>>>>>>>> at the same time. For example we can add fingerprint (hash) to
> >> every
> >>>>>>> query
> >>>>>>>>> which will be used to join "map" and "reduce" parts with each
> other
> >>>>> and
> >>>>>>> add
> >>>>>>>>> the following basic metrics:
> >>>>>>>>> 1) Execution count for particular query
> >>>>>>>>> 2) Number of map nodes - min, max, avg
> >>>>>>>>>
> >>>>>>>>
> >>>>>>>> (1) and (2) makes sense
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>> 3) Map step duration (if applicable) - min, max,
> >>>>>>>>
> >>>>>>>> 4) Reduce step duration (if applicable) - min, max, avg
> >>>>>>>>>
> >>>>>>>>
> >>>>>>>> Not sure if (3) and (4) are needed. I would only add them if they
> >> are
> >>>>>>> easy
> >>>>>>>> to implement.
> >>>>>>>>
> >>>>>>>> I would also add these:
> >>>>>>>>
> >>>>>>>> 5) Collocated: yes/no
> >>>>>>>> 6) last execution time
> >>>>>>>> 7) min/max/average execution duration
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>>
> >>>>>>>>> Once done users will be able to get statistics for particular
> >>>> queries.
> >>>>>>>>>
> >>>>>>>>> Vladimir.
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> On Tue, Feb 28, 2017 at 3:12 AM, Denis Magda <[hidden email]>
> >>>>> wrote:
> >>>>>>>>>
> >>>>>>>>>> BTW,
> >>>>>>>>>>
> >>>>>>>>>> What if we expose per-query metrics below as a part of EXPLAIN
> >>>>> ANALYZE?
> >>>>>>>>>> Sergi, is this feasible?
> >>>>>>>>>>
> >>>>>>>>>> —
> >>>>>>>>>> Denis
> >>>>>>>>>>
> >>>>>>>>>>> On Feb 27, 2017, at 2:35 PM, Denis Magda <[hidden email]>
> >>>> wrote:
> >>>>>>>>>>>
> >>>>>>>>>>> Igniters,
> >>>>>>>>>>>
> >>>>>>>>>>> Let’s shed more light on SQL query execution internals
> >>>> introducing a
> >>>>>>>>> set
> >>>>>>>>>> of useful metrics (https://issues.apache.org/
> >>>> jira/browse/IGNITE-4757
> >>>>> ).
> >>>>>>>>>>>
> >>>>>>>>>>> Per-query metrics. Total history size is defined by
> >>>>>>>>> *CacheConfiguration.
> >>>>>>>>>> getQueryDetailMetricsSize*:
> >>>>>>>>>>> * if a query was executed in the collocated or non-collocated
> >>>> mode.
> >>>>>>>>>> Three results are valid: collocated, non-collocated, simple
> query
> >>>> (no
> >>>>>>>>>> joins).
> >>>>>>>>>>> * non-collocated query: size of the data exchanged between the
> >>>> nodes
> >>>>>>> to
> >>>>>>>>>> complete a join.
> >>>>>>>>>>> * non-collocated query: did a query do broadcast or unicast to
> >> get
> >>>>>>> data
> >>>>>>>>>> needed to complete a join.
> >>>>>>>>>>> * non-collocated and collocated query: a part of the time spent
> >>>>>>> joining
> >>>>>>>>>> the data.
> >>>>>>>>>>>
> >>>>>>>>>>> CacheMetrics:
> >>>>>>>>>>> * an average number of executed SQL queries (collocated,
> >>>>>>>>> non-collocated,
> >>>>>>>>>> simple query (no joins)).
> >>>>>>>>>>>
> >>>>>>>>>>> Please don’t hesitate do share suggest another metrics or
> improve
> >>>>>>>>>> proposed ones.
> >>>>>>>>>>>
> >>>>>>>>>>> —
> >>>>>>>>>>> Denis
> >>>>>>>
> >>>>>>>
> >>>>>
> >>>>>
> >>>>
> >>>>
> >>>> --
> >>>> Sergey Kozlov
> >>>> GridGain Systems
> >>>> www.gridgain.com
> >>>>
> >>
> >>
>
>
Reply | Threaded
Open this post in threaded view
|

Re: Additional SQL metrics

dmagda
Summarized the discussion updating ticket’s description:
https://issues.apache.org/jira/browse/IGNITE-4757


Denis

> On Mar 3, 2017, at 11:14 AM, Dmitriy Setrakyan <[hidden email]> wrote:
>
> On Fri, Mar 3, 2017 at 11:07 AM, Denis Magda <[hidden email]> wrote:
>
>> What you’re saying should be default behavior. Plus, we can add a special
>> parameter that will gather metrics for queries executed longer that N.
>>
>
> Agree. I thought we already have that in some form. I remember seeing a
> warning for long queries in the log.
>
>
>>
>> —
>> Denis
>>
>>> On Mar 3, 2017, at 11:01 AM, Dmitriy Setrakyan <[hidden email]>
>> wrote:
>>>
>>> Hm... as a user I would be interested to know that, say, 95% of my
>> "select
>>> * from sometable where..." query executes under 10ms or so.
>>>
>>> I think holding some history is important and is not that hard to
>> implement.
>>>
>>> D.
>>>
>>> On Fri, Mar 3, 2017 at 10:55 AM, Denis Magda <[hidden email]> wrote:
>>>
>>>> Sergey, agree, good point!
>>>>
>>>> Igniters, any other thoughts before we wrap up the discussion updating
>> the
>>>> ticket content?
>>>>
>>>> —
>>>> Denis
>>>>
>>>>> On Mar 3, 2017, at 10:06 AM, Valentin Kulichenko <
>>>> [hidden email]> wrote:
>>>>>
>>>>> Sergey, that's great idea! Generally, user is not interested much in
>> some
>>>>> average numbers, especially in case of SQL queries. What they need is a
>>>>> list of slow queries and detailed information about the execution flow
>> of
>>>>> these particular queries.
>>>>>
>>>>> -Val
>>>>>
>>>>> On Fri, Mar 3, 2017 at 2:50 AM, Sergey Kozlov <[hidden email]>
>>>> wrote:
>>>>>
>>>>>> One more comment:
>>>>>>
>>>>>> In general the customer is interested in slow queries details thus we
>>>> can
>>>>>> introduce an option which will allow to store only queries executed
>> more
>>>>>> than NNN seconds. It may significantly reduce the the memory
>> consumption
>>>>>> for history (but logging of all queries is still available if set that
>>>>>> option to 0).
>>>>>>
>>>>>> On Fri, Mar 3, 2017 at 1:00 AM, Denis Magda <[hidden email]>
>> wrote:
>>>>>>
>>>>>>> Vovan,
>>>>>>>
>>>>>>> When I’m speaking of JOIN metrics I’m simply assume that we need to
>> add
>>>>>>> metrics relevant for queries with joins, metrics that will help us
>> get
>>>>>> more
>>>>>>> insights on non-collocated and collocated joins execution flow.
>>>>>>>
>>>>>>>> 1) Query exec count
>>>>>>>> 2) Query exec time (first define what "time" means) - min, max, avg
>>>>>>>
>>>>>>> Total query execution time might not be helpful in the trickiest
>> cases.
>>>>>>> What if you have multiple joins in your query? How do I know which
>> one
>>>>>>> contributes to the execution most?
>>>>>>>
>>>>>>> So, I would do a breakdown having total time, map time, per-join
>> time,
>>>>>>> reduce time. Hope it’s possible. If it’s unclear how to support
>>>>>> everything
>>>>>>> at the first place then it’s a different question. Let’s create
>> several
>>>>>>> tickets and start implementing everything gracefully.
>>>>>>>
>>>>>>>> 3) Number of bytes exchanged between nodes during query execution
>>>>>>>
>>>>>>> It will be really helpful to make a breakdown showing a number of
>> bytes
>>>>>>> exchanged per-join (physical join). Again, if you believe it makes
>>>> sense
>>>>>> to
>>>>>>> do the breakdown later then let’s create an additional ticket for
>> the 2
>>>>>>> tier metrics then.
>>>>>>>
>>>>>>>> 4) Number of returned rows - min, max, avg
>>>>>>>
>>>>>>>
>>>>>>> Plus, let’s add the following to the list:
>>>>>>>
>>>>>>> 5) Collocated: yes or no
>>>>>>>
>>>>>>> —
>>>>>>> Denis
>>>>>>>
>>>>>>>> On Mar 2, 2017, at 1:32 PM, Vladimir Ozerov <[hidden email]>
>>>>>>> wrote:
>>>>>>>>
>>>>>>>> Denis,
>>>>>>>>
>>>>>>>> The main problem with suggested metrics is that they implies that
>>>>>> ceratin
>>>>>>>> internal mechanics work in predefined way. For example, what is JOIN
>>>>>>>> metrics? There are no guarantees that JOIN in user's query will be
>>>>>>>> translated to a real physical join. What if several different query
>>>>>>>> execution pieces happen in parallel? What if we rework our
>> distributed
>>>>>>>> query engine from pull to push approach for performance reasons and
>>>>>> there
>>>>>>>> will be no JOINs in classical sense?
>>>>>>>>
>>>>>>>> This is why I think that we should start with very basic things.
>>>>>>> Something
>>>>>>>> like:
>>>>>>>> 1) Query exec count
>>>>>>>> 2) Query exec time (first define what "time" means) - min, max, avg
>>>>>>>> 3) Number of bytes exchanged between nodes during query execution
>>>>>>>> 4) Number of returned rows - min, max, avg
>>>>>>>>
>>>>>>>> Once we have base numbers in place, we can think of carefully
>>>>>> integrating
>>>>>>>> and enhancing all pieces of query execution into more verbose
>> formats,
>>>>>>>> similar to query plans with relative weights in classical RDBMS
>>>>>> systems.
>>>>>>>>
>>>>>>>> Thoughts?
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Thu, Mar 2, 2017 at 11:31 PM, Denis Magda <[hidden email]>
>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Vovan,
>>>>>>>>>
>>>>>>>>> Your metrics make perfect sense to me. However, I see a high demand
>>>>>> for
>>>>>>>>> JOINs based metrics especially from those who give a try to
>>>>>>> non-collocated
>>>>>>>>> joins in production  and want to measure them somehow. This is why,
>>>>>>>>> personally, I prefer to see the metrics below in the top priority
>>>> list
>>>>>>> as
>>>>>>>>> well:
>>>>>>>>>
>>>>>>>>> if a query was executed in the collocated or non-collocated mode.
>>>>>> Three
>>>>>>>>> results are valid: collocated, non-collocated, simple query (no
>>>>>> joins).
>>>>>>>>> non-collocated query: size of the data exchanged between the nodes
>> to
>>>>>>>>> complete a specific join. If there are multiple joins in the query
>> we
>>>>>>> need
>>>>>>>>> to provide this metric for every of them.
>>>>>>>>> non-collocated and collocated query: a part of the time spent
>> joining
>>>>>>> the
>>>>>>>>> data. If there are multiple joins in the query we need to provide
>>>> this
>>>>>>>>> metric for every of them.
>>>>>>>>>
>>>>>>>>> As for “unicast” and “broadcast”, agree, let’s ignore it for now.
>>>>>>>>>
>>>>>>>>> In any case, can we include timing information (map phase, reduce
>>>>>> phase,
>>>>>>>>> join phase) into an execution plan produced by H2? Are there any
>>>>>>>>> implementation hooks?
>>>>>>>>>
>>>>>>>>> —
>>>>>>>>> Denis
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>> On Mar 2, 2017, at 12:02 PM, Dmitriy Setrakyan <
>>>>>> [hidden email]>
>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>> I think some of the metrics specified by Denis also make sense,
>> so I
>>>>>>>>> would
>>>>>>>>>> add them as well. See below...
>>>>>>>>>>
>>>>>>>>>> On Thu, Mar 2, 2017 at 12:36 AM, Vladimir Ozerov <
>>>>>> [hidden email]
>>>>>>>>> <mailto:[hidden email]>>
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>> Denis,
>>>>>>>>>>>
>>>>>>>>>>> Query execution is complex process involving different stages
>> which
>>>>>>> are
>>>>>>>>> not
>>>>>>>>>>> very easy to match with each other. Especially provided that any
>>>>>> node
>>>>>>>>> can
>>>>>>>>>>> leave topology at any time. Another problem is that engine
>> evolves
>>>>>> and
>>>>>>>>>>> metrics like "did a query do broadcast or unicast" may easily
>>>> become
>>>>>>>>>>> useless at some point, because for example there will be neither
>>>>>>>>> unicast,
>>>>>>>>>>> nor broadast, but something different. On the other hand I
>>>>>> completely
>>>>>>>>> agree
>>>>>>>>>>> that performance monitoring is essential part of any mature DBMS.
>>>>>>>>>>>
>>>>>>>>>>> I would start with metrics which are both very basic and easy to
>>>>>>>>> implement
>>>>>>>>>>> at the same time. For example we can add fingerprint (hash) to
>>>> every
>>>>>>>>> query
>>>>>>>>>>> which will be used to join "map" and "reduce" parts with each
>> other
>>>>>>> and
>>>>>>>>> add
>>>>>>>>>>> the following basic metrics:
>>>>>>>>>>> 1) Execution count for particular query
>>>>>>>>>>> 2) Number of map nodes - min, max, avg
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> (1) and (2) makes sense
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>> 3) Map step duration (if applicable) - min, max,
>>>>>>>>>>
>>>>>>>>>> 4) Reduce step duration (if applicable) - min, max, avg
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Not sure if (3) and (4) are needed. I would only add them if they
>>>> are
>>>>>>>>> easy
>>>>>>>>>> to implement.
>>>>>>>>>>
>>>>>>>>>> I would also add these:
>>>>>>>>>>
>>>>>>>>>> 5) Collocated: yes/no
>>>>>>>>>> 6) last execution time
>>>>>>>>>> 7) min/max/average execution duration
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Once done users will be able to get statistics for particular
>>>>>> queries.
>>>>>>>>>>>
>>>>>>>>>>> Vladimir.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Tue, Feb 28, 2017 at 3:12 AM, Denis Magda <[hidden email]>
>>>>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> BTW,
>>>>>>>>>>>>
>>>>>>>>>>>> What if we expose per-query metrics below as a part of EXPLAIN
>>>>>>> ANALYZE?
>>>>>>>>>>>> Sergi, is this feasible?
>>>>>>>>>>>>
>>>>>>>>>>>> —
>>>>>>>>>>>> Denis
>>>>>>>>>>>>
>>>>>>>>>>>>> On Feb 27, 2017, at 2:35 PM, Denis Magda <[hidden email]>
>>>>>> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>> Igniters,
>>>>>>>>>>>>>
>>>>>>>>>>>>> Let’s shed more light on SQL query execution internals
>>>>>> introducing a
>>>>>>>>>>> set
>>>>>>>>>>>> of useful metrics (https://issues.apache.org/
>>>>>> jira/browse/IGNITE-4757
>>>>>>> ).
>>>>>>>>>>>>>
>>>>>>>>>>>>> Per-query metrics. Total history size is defined by
>>>>>>>>>>> *CacheConfiguration.
>>>>>>>>>>>> getQueryDetailMetricsSize*:
>>>>>>>>>>>>> * if a query was executed in the collocated or non-collocated
>>>>>> mode.
>>>>>>>>>>>> Three results are valid: collocated, non-collocated, simple
>> query
>>>>>> (no
>>>>>>>>>>>> joins).
>>>>>>>>>>>>> * non-collocated query: size of the data exchanged between the
>>>>>> nodes
>>>>>>>>> to
>>>>>>>>>>>> complete a join.
>>>>>>>>>>>>> * non-collocated query: did a query do broadcast or unicast to
>>>> get
>>>>>>>>> data
>>>>>>>>>>>> needed to complete a join.
>>>>>>>>>>>>> * non-collocated and collocated query: a part of the time spent
>>>>>>>>> joining
>>>>>>>>>>>> the data.
>>>>>>>>>>>>>
>>>>>>>>>>>>> CacheMetrics:
>>>>>>>>>>>>> * an average number of executed SQL queries (collocated,
>>>>>>>>>>> non-collocated,
>>>>>>>>>>>> simple query (no joins)).
>>>>>>>>>>>>>
>>>>>>>>>>>>> Please don’t hesitate do share suggest another metrics or
>> improve
>>>>>>>>>>>> proposed ones.
>>>>>>>>>>>>>
>>>>>>>>>>>>> —
>>>>>>>>>>>>> Denis
>>>>>>>>>
>>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> Sergey Kozlov
>>>>>> GridGain Systems
>>>>>> www.gridgain.com
>>>>>>
>>>>
>>>>
>>
>>