Collecting query metrics and statistics

Collecting query metrics and statistics

Vladimir Ozerov
Igniters,

At the moment Ignite lacks administrative and management capabilities in
various places. One such demanded place is SQL queries. I want to start a
discussion around adding better query metrics to Ignite.

I propose to split the task into two parts: infrastructure and UX.

1) We start with adding a unique UUID to all query requests, to be able to
glue all the pieces together. Then we log the following things (approximately):
- Query ID
- Original query
- Map query
- Mapper execution time
- Mapper rows
- Mapper output bytes (to reducer)
- Mapper input bytes (for distributed joins)
- Reduce query
- Reduce execution time
- Reduce rows
- Reduce input bytes (sum of all mapper output bytes)
- Possibly some info on distributed joins
- Advanced things in the future (memory page accesses, disk accesses, etc.)

All these stats are saved to local structures. These structures are
accessible through some API and typically not exchanged between nodes until
requested.
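
For illustration, the local structure could be as simple as a per-node map
keyed by the query ID. A rough sketch (class and field names below are made
up, not a proposed API):

import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical per-query record kept locally on each node. */
class QueryStatsRecord {
    final UUID queryId;             // Glues map and reduce pieces together.
    final String originalQuery;     // Query text as submitted by the user.
    volatile String mapQuery;       // Rewritten map-side query.
    volatile long mapTimeMs;        // Mapper execution time.
    volatile long mapRows;          // Rows produced by the mapper.
    volatile long mapOutputBytes;   // Bytes sent to the reducer.
    volatile long mapInputBytes;    // Bytes received (distributed joins).
    volatile String reduceQuery;    // Reduce-side query.
    volatile long reduceTimeMs;     // Reducer execution time.
    volatile long reduceRows;       // Rows produced by the reducer.
    volatile long reduceInputBytes; // Sum of all mapper output bytes.

    QueryStatsRecord(UUID queryId, String originalQuery) {
        this.queryId = queryId;
        this.originalQuery = originalQuery;
    }
}

/** Local registry; nothing is sent to other nodes until requested. */
class LocalQueryStatsRegistry {
    private final Map<UUID, QueryStatsRecord> stats = new ConcurrentHashMap<>();

    QueryStatsRecord onQueryStarted(UUID qryId, String originalQuery) {
        return stats.computeIfAbsent(qryId, id -> new QueryStatsRecord(id, originalQuery));
    }

    QueryStatsRecord get(UUID qryId) {
        return stats.get(qryId);
    }
}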

2) UX
The most important part - we will make these stats *accessible through SQL*!

SELECT * FROM map_query_stats
|node|query_id|query              |time_ms|input_bytes|
-------------------------------------------------------
|CLI1|*UUID1* |SELECT AVG(f)      |50     |100        |

SELECT * FROM reduce_query_stats
|node|query_id|query                  |time_ms|output_bytes|disk_reads|
-----------------------------------------------------------------------
|SRV1|*UUID1* |SELECT SUM(f), COUNT(f)|00     |20          |35        |

Then we do some UNIONS/JOINS from client/console/driver/whatever, and:
SELECT ... FROM map_query_stats WHERE query_id = *UUID1*
UNION
SELECT ... FROM reduce_query_stats WHERE query_id = *UUID1*

|total_time|total_disk_reads|reduce_node|reduce_time|reduce_disk_reads|
-----------------------------------------------------------------------
|100       |180             |           |           |                 |
-----------------------------------------------------------------------
|          |                |SRV1       |10         |35               |
-----------------------------------------------------------------------
|          |                |SRV2       |90         |130              |
-----------------------------------------------------------------------
|          |                |SRV3       |20         |25               |
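
From code the same picture can be assembled through JDBC. A minimal sketch,
assuming the map_query_stats/reduce_query_stats views above exist (the URL,
column set and query ID literal are only illustrative):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

public class QueryStatsClient {
    public static void main(String[] args) throws Exception {
        try (Connection conn = DriverManager.getConnection("jdbc:ignite:thin://127.0.0.1")) {
            String qryId = "UUID1"; // ID of the query of interest.

            String sql =
                "SELECT node, query_id, time_ms, input_bytes " +
                "FROM map_query_stats WHERE query_id = ? " +
                "UNION ALL " +
                "SELECT node, query_id, time_ms, output_bytes " +
                "FROM reduce_query_stats WHERE query_id = ?";

            try (PreparedStatement stmt = conn.prepareStatement(sql)) {
                stmt.setString(1, qryId);
                stmt.setString(2, qryId);

                try (ResultSet rs = stmt.executeQuery()) {
                    while (rs.next())
                        System.out.printf("%s %s %d ms %d bytes%n",
                            rs.getString(1), rs.getString(2), rs.getLong(3), rs.getLong(4));
                }
            }
        }
    }
}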

Makes sense?

Vladimir.

Re: Collecting query metrics and statistics

Andrew Mashenkov
Vladimir,

Looks like a "must have" feature.
Most databases I have seen make such info available via system tables in a
dedicated schema.

We have to choose and reserve a schema name for this, e.g.
"information_schema".

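For instance, with a reserved schema the stats views become discoverable
through standard JDBC metadata. A rough sketch, assuming the schema name
above (the connection URL is also just an example):

import java.sql.Connection;
import java.sql.DatabaseMetaData;
import java.sql.DriverManager;
import java.sql.ResultSet;

public class ListStatsViews {
    public static void main(String[] args) throws Exception {
        try (Connection conn = DriverManager.getConnection("jdbc:ignite:thin://127.0.0.1")) {
            DatabaseMetaData md = conn.getMetaData();

            // List everything exposed under the reserved statistics schema.
            try (ResultSet rs = md.getTables(null, "INFORMATION_SCHEMA", "%", null)) {
                while (rs.next())
                    System.out.println(rs.getString("TABLE_SCHEM") + "." + rs.getString("TABLE_NAME"));
            }
        }
    }
}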


--
Best regards,
Andrey V. Mashenkov

Re: Collecting query metrics and statistics

Vladimir Ozerov
In reply to this post by Vladimir Ozerov
NB: I mixed up "mapper" and "reducer" in a few places in the second part of
the message.

Re: Collecting query metrics and statistics

yzhdanov
Vladimir, I like the idea very much. We should also provide the stats via
MBeans.
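
For example, a rough sketch of such a bean (the interface, attributes and
object name are all made up; a real implementation would follow Ignite's
MBean naming conventions and be fed from the local stats):

import java.lang.management.ManagementFactory;
import javax.management.MBeanServer;
import javax.management.ObjectName;

/** Hypothetical MXBean exposing aggregated per-node query stats. */
public interface QueryStatsMXBean {
    long getMapQueriesExecuted();
    long getMapTimeTotalMs();
    long getReduceQueriesExecuted();
    long getReduceTimeTotalMs();
}

class QueryStatsMBeanImpl implements QueryStatsMXBean {
    // Stub values; a real implementation would read the local stats registry.
    @Override public long getMapQueriesExecuted() { return 0; }
    @Override public long getMapTimeTotalMs() { return 0; }
    @Override public long getReduceQueriesExecuted() { return 0; }
    @Override public long getReduceTimeTotalMs() { return 0; }

    static void register() throws Exception {
        MBeanServer srv = ManagementFactory.getPlatformMBeanServer();
        srv.registerMBean(new QueryStatsMBeanImpl(),
            new ObjectName("org.example:group=SQL,name=QueryStats"));
    }
}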

--Yakov