Ignite diagnostic (SQL system views)


Ignite diagnostic (SQL system views)

Alexey Plekhanov
Hello, Igniters!

For Ignite diagnostics it's usually helpful to get information about Ignite
internals. But currently, in my opinion, there are no convenient tools
for this purpose:

· Some issues can be solved by analyzing log files. Log files are
useful for dumps, but sometimes they are difficult to read. Also,
interesting metrics can't be requested at runtime; we have to wait
until Ignite writes them on a timeout or some other event.

· JMX is useful for scalar metrics. Complex and tabular data can also
be retrieved, but it's difficult to read, filter and sort it without
processing by specialized external tools. For the most frequently used cases,
nearly duplicate metrics are created just to show the data in an easy-to-read form.

· Web-console is able to show tabular and complex data. Perhaps someday
web-console will contain all the dashboards needed for most problem
investigations, but some non-trivial queries will still not be covered.
Also, web-console needs additional infrastructure to work.

· External "home-made" tools can be used for non-trivial cases. They
cover highly specialized scenarios and usually can't serve as general-purpose
tools.

Sometimes we are forced to use more than one tool and join the data by hand
(for example, a current thread dump and data from the logs).

For diagnostic purposes an RDBMS often provides system views (for example,
DBA_% and V$% in Oracle), which can be queried via SQL. This approach makes
all internal diagnostic information available in a readable form (with all
possible filters and projections) without using any other internal or
external tools. My proposal is to create similar system views in Ignite.

I implemented a working prototype (PR: [1]). It contains the following views:

IGNITE_SYSTEM_VIEWS - Registered system views
IGNITE_INSTANCE - Ignite instance
IGNITE_JVM_THREADS - JVM threads
IGNITE_JVM_RUNTIME - JVM runtime
IGNITE_JVM_OS - JVM operating system
IGNITE_CACHES - Ignite caches
IGNITE_CACHE_CLUSTER_METRICS - Ignite cache cluster metrics
IGNITE_CACHE_NODE_METRICS - Ignite cache node metrics
IGNITE_CACHE_GROUPS - Cache groups
IGNITE_NODES - Nodes in topology
IGNITE_NODE_HOSTS - Node hosts
IGNITE_NODE_ADDRESSES - Node addresses
IGNITE_NODE_ATTRIBUTES - Node attributes
IGNITE_NODE_METRICS - Node metrics
IGNITE_TRANSACTIONS - Active transactions
IGNITE_TRANSACTION_ENTRIES - Cache entries used by transaction
IGNITE_TASKS - Active tasks
IGNITE_PART_ASSIGNMENT - Partition assignment map
IGNITE_PART_ALLOCATION - Partition allocation map

Many more useful views can be implemented (executor diagnostics,
SPI diagnostics, etc.).
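
For example, a future executors view could answer questions like the one below
(the view name IGNITE_EXECUTORS and its columns are purely hypothetical here,
just to illustrate the idea):

SELECT NAME, ACTIVE_COUNT, QUEUE_SIZE
FROM INFORMATION_SCHEMA.IGNITE_EXECUTORS
WHERE QUEUE_SIZE > 1000
ORDER BY QUEUE_SIZE DESC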

Some usage examples:

Cache groups and their partitions used by transactions running longer than 5
minutes:

SELECT cg.CACHE_OR_GROUP_NAME, te.KEY_PARTITION, count(*) AS ENTITIES_CNT
FROM INFORMATION_SCHEMA.IGNITE_TRANSACTIONS t
JOIN INFORMATION_SCHEMA.IGNITE_TRANSACTION_ENTRIES te ON t.XID = te.XID
JOIN INFORMATION_SCHEMA.IGNITE_CACHES c ON te.CACHE_NAME = c.NAME
JOIN INFORMATION_SCHEMA.IGNITE_CACHE_GROUPS cg ON c.GROUP_ID = cg.ID
WHERE t.START_TIME < TIMESTAMPADD('MINUTE', -5, NOW())
GROUP BY cg.CACHE_OR_GROUP_NAME, te.KEY_PARTITION



Average CPU load on server nodes grouped by operating system:

SELECT na.VALUE, COUNT(n.ID), AVG(nm.AVG_CPU_LOAD) AVG_CPU_LOAD
FROM INFORMATION_SCHEMA.IGNITE_NODES n
JOIN INFORMATION_SCHEMA.IGNITE_NODE_ATTRIBUTES na ON na.NODE_ID = n.ID AND
na.NAME = 'os.name'
JOIN INFORMATION_SCHEMA.IGNITE_NODE_METRICS nm ON nm.NODE_ID = n.ID
WHERE n.IS_CLIENT = false
GROUP BY na.VALUE



Top 5 nodes by puts to cache ‘cache’:

SELECT cm.NODE_ID, cm.CACHE_PUTS FROM
INFORMATION_SCHEMA.IGNITE_CACHE_NODE_METRICS cm
WHERE cm.CACHE_NAME = 'cache'
ORDER BY cm.CACHE_PUTS DESC
LIMIT 5



Is this implementation interesting to anyone else? Maybe some views are
redundant? Which additional first-priority views should be implemented? Any
other thoughts or proposals?

[1] https://github.com/apache/ignite/pull/3413

Re: Ignite diagnostic (SQL system views)

Vladimir Ozerov
Hi Alex,

System views could be an extremely valuable addition to Ignite. Ideally, a user
should be able to monitor and manage the state of the whole cluster from a
single SQL command line. We have had plans to implement this for a very long
time. However, it is a very sensitive task which has to take a lot of
moving pieces into account, such as usability, consistency, performance,
security, etc.

Let me point out several major concerns I see at the moment:

1) Usability: INFORMATION_SCHEMA
This schema is part of the ANSI SQL standard. When creating system views, some
vendors prefer to store them in a completely separate predefined schema
(Oracle, MS SQL). Others prefer to keep them in INFORMATION_SCHEMA
directly. Both approaches could work. However, the latter breaks separation
of concerns - we store typical metadata next to possibly sensitive system
data. It also makes security management more complex - system data is very
sensitive, so we cannot simply grant a user access to INFORMATION_SCHEMA.
Instead, we have to grant that access on a per-view basis. For this
reason my preference is to store system tables in a separate schema, not in
INFORMATION_SCHEMA.

2) Consistency: local data
One of the implemented views is GridH2SysViewImplInstance. Normally SQL users
communicate with Ignite through JDBC/ODBC drivers. These drivers are
connected to a single node, typically a client node. Moreover, we will
introduce a high-availability feature where drivers will be able to connect to
any address from a predefined list. That renders this view useless, as you do
not know which node you are connected to. Also, local-only data cannot be
joined in the general case - you will receive different results on different
nodes. The same goes for transactions, JVM info, etc.

3) Performance
Suppose we fixed the consistency of transactions and this view now shows
transactions across the whole cluster with the ability to filter them by node
- this is what a user would expect out of the box. Then another problem appears
- performance. How would we collect the necessary data? How would we
handle joins, when a particular view could be scanned multiple times during
query execution? How do we achieve sensible consistency? Most probably we
would collect remote data once when the query is started, cache it somehow at
the query session level, and then reuse it during joins. But again, this should
be discussed separately.

4) Security: JVM info
We should define clear boundaries of what info is exposed. JVM data along
with running threads is critically sensitive information. We should not
expose it until we have authorization capabilities.

In order to start moving this code from prototype to production state we
should start with the simplest and most consistent views, e.g. IGNITE_CACHES.
Let's move it to a separate PR, review the infrastructure code, review the view
implementation, agree on proper naming and placement, and merge it. Then
each and every view (or group of related views) should be discussed and
reviewed separately.

As for node-local stuff, maybe we should move it to a separate schema,
or mark it with a special prefix, e.g. "IGNITE.TRANSACTIONS" - all transactions
in the cluster, "IGNITE.LOCAL_TRANSACTIONS" - transactions on the local
node. In this case we will be able to merge the "local" stuff shortly, and
implement the more complex but at the same time much more useful distributed
stuff later on.
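
For illustration, queries against the two flavors might look something like this
(the column names here are just assumptions for the example, not an agreed schema):

SELECT XID, NODE_ID, STATE FROM IGNITE.TRANSACTIONS WHERE STATE = 'ACTIVE'

SELECT XID, STATE, START_TIME FROM IGNITE.LOCAL_TRANSACTIONS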

Makes sense?

Vladimir.



Re: Ignite diagnostic (SQL system views)

Alexey Plekhanov
1) It's not a fundamental point, I can change the schema. INFORMATION_SCHEMA
was used because it already exists and is usually used for metadata tables
and views. Your proposal is to use the schema "IGNITE", do I understand you
right? BTW, for now we can't query the other (H2) meta tables from
INFORMATION_SCHEMA, so "Ignite system views" are the only views available to
query from this schema.
2) The IGNITE_INSTANCE view is useful exactly for this reason: to determine
which node we are connected to.
3) As the first phase, in my opinion, local views will be enough.
Performance and caching of distributed views should be discussed in the next
phases, when the distributed views implementation is planned. In the current
implementation I tried to use indexing for local views wherever possible
(see the example after this list).
4) I don't think that JVM info is more critical than, for example, cache or
node information. When are authorization capabilities planned to be implemented?
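
To illustrate point 3: an indexed column lets a point lookup avoid a full scan
of the view. A minimal example (XID comes from the prototype's
IGNITE_TRANSACTIONS view; the parameter is just a placeholder):

SELECT * FROM INFORMATION_SCHEMA.IGNITE_TRANSACTIONS WHERE XID = ?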

About local data: yes, we can rename all currently implemented views over
local node data to LOCAL_..., and (someday) create new whole-cluster
views (which use distributed requests) without a prefix or, for example, with
a CLUSTER_ prefix. But some views can show information about the whole cluster
using only local node data, without distributed requests (for example
IGNITE_NODE_METRICS, IGNITE_PART_ASSIGNMENT, IGNITE_PART_ALLOCATION,
IGNITE_NODES, etc.). Are they local or cluster views in this concept? Which
prefix should be used? And what about caches? Are they local or cluster? On
the local node we can see cluster-wide caches (replicated and distributed) as
well as caches for the current node only. The list of local caches may differ
from node to node. Which prefix should be used for this view? And one more
thing: for some views it makes no sense to make them cluster-wide (for example
IGNITE_INSTANCE). Should we name it LOCAL_INSTANCE without creating an
INSTANCE view?

So, next steps: split the PR, change the schema name (IGNITE?), change the
view name for caches (CACHES, LOCAL_CACHES?).



Re: Ignite diagnostic (SQL system views)

Vladimir Ozerov
Let's start with a single and the simplest view, e.g.
LOCAL_TRANSACTIONS. We will review and merge it along with the necessary
infrastructure. Then we will handle the rest of the views in separate tickets
and separate focused discussions.


Re: Ignite diagnostic (SQL system views)

Anton Vinogradov
I've created IEP-13 [1] to cover all cases.
Feel free to create issues.

[1]
https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=75962769


Re: Ignite diagnostic (SQL system views)

Alexey Plekhanov
Anton, Vladimir, I've made some fixes. There is only one view left and it's
renamed to 'IGNITE.LOCAL_TRANSACTIONS'.

High-level design of the solution:
When IgniteH2Indexing starts, it creates and starts a
new GridH2SysViewProcessor, which creates and registers in H2 (via its own
table engine) all implementations of system views. Each system view
implementation extends the base abstract class GridH2SysView. A view
implementation describes its columns, their types and indexes in the
constructor and must override the getRows method for data retrieval (this
method is called by the H2-compatible table and index implementations for
Ignite system views). Almost no changes to the existing parsing engine were
made, except in a few places where a GridH2Table instance was expected but
system views use a different class.
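
For reference, after registration the view can be queried like any regular
table; a minimal sketch (assuming LOCAL_TRANSACTIONS keeps the XID and
START_TIME columns from the original prototype):

SELECT XID, START_TIME
FROM IGNITE.LOCAL_TRANSACTIONS
WHERE START_TIME < TIMESTAMPADD('MINUTE', -5, NOW())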

New PR: [1].  Please have a look.

[1] https://github.com/apache/ignite/pull/3433

2018-01-24 19:12 GMT+03:00 Anton Vinogradov <[hidden email]>:

> I've created IEP-13 [1] to cover all cases.
> Feel free to create issues.
>
> [1]
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=75962769
>
> On Wed, Jan 24, 2018 at 6:10 PM, Vladimir Ozerov <[hidden email]>
> wrote:
>
> > Let's start with a single and the most simple view, e.g.
> > LOCAL_TRANSACTIONS. We will review and merge it along with necessary
> > infrastructure. Then will handle the rest view in separate tickets and
> > separate focused discussions.
> >
> > On Wed, Jan 24, 2018 at 5:29 PM, Alex Plehanov <[hidden email]>
> > wrote:
> >
> > > 1) It’s not a principal point, I can change schema. The
> > INFORMATION_SCHEMA
> > > was used because it’s already exists and usually used for metadata
> tables
> > > and views. Your proposal is to use schema “IGNITE”, am I understand you
> > > right? BTW, for now, we can’t query another (H2) meta tables from the
> > > INFORMATION_SCHEMA, so, “Ignite system views” is only available views
> to
> > > query from this schema.
> > > 2) Exactly for this reason the IGNITE_INSTANCE view is useful: to
> > determine
> > > which node we are connected to.
> > > 3) As the first phase, in my opinion, local views will be enough.
> > > Performance and caching of distributed views should be discussed at
> next
> > > phases, when distributed views implementation will be planned. In
> current
> > > implementation I tried to use indexing for local views wherever it’s
> > > possible.
> > > 4) I don’t think, that JVM info is more critical information than, for
> > > example, caches or nodes information. When authorization capabilities
> > > planned to implement?
> > >
> > > About local data: yes, we can rename all currently implemented views
> for
> > > the local node data as LOCAL_..., and create (someday) new whole
> cluster
> > > views (which use distributed requests) without prefix or, for example,
> > with
> > > CLUSTER_ prefix. But some views can show all cluster information using
> > only
> > > local node data, without distributed requests (for example
> > > IGNITE_NODE_METRICS, IGNITE_PART_ASSIGNMENT, IGNITE_PART_ALLOCATION,
> > > IGNITE_NODES, etc). Are they local or cluster views in this concept?
> > Which
> > > prefix should be used? And what about caches? Are they local or
> cluster?
> > On
> > > local node we can see cluster wide caches (replicated and distributed)
> > and
> > > caches for current node only. Local caches list may differ from node to
> > > node. Which prefix should be used for this view? And one more, there is
> > no
> > > sense for some views to make them cluster wide (for example
> > > INGNITE_INSTANCE). Should we name it LOCAL_INSTANCE without creating
> > > INSTANCE view?
> > >
> > > So, next steps: split PR, change schema name (IGNITE?), change view
> name
> > > for caches (CACHES, LOCAL_CACHES?)
> > >
> > >
> > > 2018-01-24 13:03 GMT+03:00 Vladimir Ozerov <[hidden email]>:
> > >
> > > > Hi Alex,
> > > >
> > > > System views could be extremely valuable addition for Ignite.
> Ideally,
> > > user
> > > > should be able to monitor and manage state of the whole cluster with
> a
> > > > single SQL command line. We have plans to implement it for a very
> long
> > > > time. However, this is very sensitive task which should take a lot of
> > > > moving pieces in count, such as usability, consistency, performance,
> > > > security, etc..
> > > >
> > > > Let me point several major concerns I see at the moment:
> > > >
> > > > 1) Usability: INFORMATION_SCHEMA
> > > > This schema is part of SQL ANSI standard. When creating system views,
> > > some
> > > > vendors prefer to store them in completely different predefined
> schema
> > > > (Oracle, MS SQL). Others prefer to keep them in INFORMATION_SCHEMA
> > > > directly. Both approaches could work. However, the latter breaks
> > > separation
> > > > of concerns - we store typical metadata near to possibly sensitive
> > system
> > > > data. Also it makes security management more complex - system data is
> > > very
> > > > sensitive, and now we cannot simply grant access INFORMATIONAL_SCHEMA
> > to
> > > > user. Instead, we have to grant that access on per-view basis. For
> this
> > > > reason my preference is to store system tables in separate schema,
> not
> > in
> > > > INFORMATION_SCHEMA
> > > >
> > > > 2) Consistency: local data
> > > > One of implemented view GridH2SysViewImplInstance. Normally SQL users
> > > > communicate with Ignite through JDBC/ODBC drivers. These drivers are
> > > > connected to a single node, typically client node. Moreover, we will
> > > > introduce high-availability feature when drivers were able to connect
> > to
> > > > any address from a predefined list. It renders this view useless, as
> > you
> > > do
> > > > not know which node you connected to. Also, local-only data cannot be
> > > > joined in general case - you will receive different results on
> > different
> > > > nodes. The same goes for transactions, JVM info, etc.
> > > >
> > > > 3) Performance
> > > > Suppose we fixed consistency of transactions and now this view shows
> > > > transactions in the whole cluster with possibility to filter them by
> > > nodes
> > > > - this is what user would expect out of the box. Another problem
> > appears
> > > > then - performance. How would we collect necessary data? How would we
> > > > handle joins, when particular view could be scanned multiple times
> > during
> > > > query execution? How we achieve sensible consistency? Most probably
> we
> > > > would collect remote data once when query is started, cache it
> somehow
> > on
> > > > query session level, and then re-use during joins. But again, this
> > should
> > > > be discussed separately.
> > > >
> > > > 4) Security: JVM info
> > > > We should define clear boundaries of what info is exposed. JVM data
> > along
> > > > with running threads is critically sensitive information. We should
> not
> > > > expose it until we have authorization capabilities.
> > > >
> > > > In order to start moving this code from prototype to production state
> > we
> > > > should start with the most simple and consistent views. E.g.
> > > IGNITE_CACHES.
> > > > Let's move it to a separate PR, review infrastructure code, review
> view
> > > > implementation, agree on proper naming and placement, and merge it.
> > Then
> > > > each and every view (or group of related views) should be discussed
> and
> > > > reviewed separately.
> > > >
> > > > As far as node-local stuff, may be we should move it to a separate
> > > schema,
> > > > or mark with special prefix. E.g. "IGNITE.TRANSACTIONS" - all
> > > transactions
> > > > in the cluster, "IGNITE.LOCAL_TRANSACTIONS" - transactions on the
> local
> > > > node. In this case we will be able to merge "local" stuff shortly,
> and
> > > > implement more complex but at the same time much more useful
> > distributed
> > > > stuff later on.
> > > >
> > > > Makes sense?
> > > >
> > > > Vladimir.
> > > >
> > > >

Re: Ignite diagnostic (SQL system views)

Alexey Plekhanov
The views engine and the first view are almost ready to merge (review
comments are resolved). Which views should we take next? My proposal:
NODES, NODE_ATTRIBUTES, NODE_METRICS, NODE_HOSTS and NODE_ADDRESSES, since
these views are clear and all topology data is available on each node.
Any objections?
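
For illustration, the kind of query these views would enable (a sketch only:
the IGNITE schema and the unprefixed view names follow the naming discussed
earlier in this thread, and the columns are assumed to match the prototype's
IGNITE_NODES and IGNITE_NODE_ATTRIBUTES):

SELECT n.ID, na.VALUE AS OS_NAME
FROM IGNITE.NODES n
JOIN IGNITE.NODE_ATTRIBUTES na ON na.NODE_ID = n.ID AND na.NAME = 'os.name'
WHERE n.IS_CLIENT = false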

2018-01-25 16:27 GMT+03:00 Alex Plehanov <[hidden email]>:

> Anton, Vladimir, I've made some fixes. There is only one view left and
> it's renamed to 'IGNITE.LOCAL_TRANSACTIONS'.
>
> High level design of the solution:
> When IgniteH2Indexing starts, it creates and starts a new
> GridH2SysViewProcessor, which creates and registers in H2 (via its own
> table engine) all implementations of system views. Each system view
> implementation extends the base abstract class GridH2SysView. A view
> implementation describes its columns, their types and indexes in the
> constructor and must override the getRows method for data retrieval (this
> method is called by the H2-compatible table and index implementations for
> Ignite system views). Almost no changes were made to the existing parsing
> engine, except in a few places where a GridH2Table instance was expected
> but system views use another class.
>
> New PR: [1].  Please have a look.
>
> [1] https://github.com/apache/ignite/pull/3433
>
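
To make the renaming above concrete, a minimal query against the remaining
view could look like this (a sketch: the LOCAL_TRANSACTIONS name follows the
quoted message, while the XID and START_TIME columns are assumptions carried
over from the prototype's IGNITE_TRANSACTIONS and may change during review):

SELECT XID, START_TIME
FROM IGNITE.LOCAL_TRANSACTIONS
WHERE START_TIME < TIMESTAMPADD('MINUTE', -5, NOW())
ORDER BY START_TIME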

Re: Ignite diagnostic (SQL system views)

dmagda
Alex P, sounds like a good plan to me.

Vladimir, do you have any suggestions or corrections?


Denis



Re: Ignite diagnostic (SQL system views)

Vladimir Ozerov
I would start with NODES and NODE_ATTRIBUTES as the most simple thing.
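
Even the simplest of them would already be useful on its own, e.g. (a sketch,
assuming the IGNITE schema and the IS_CLIENT column from the prototype's
IGNITE_NODES):

SELECT IS_CLIENT, COUNT(*) AS NODE_CNT
FROM IGNITE.NODES
GROUP BY IS_CLIENT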


Re: Ignite diagnostic (SQL system views)

Anton Vinogradov
Vova,

Could you confirm that https://issues.apache.org/jira/browse/IGNITE-7527 is
ready to be merged?
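
Once it is merged, a quick way to check which views actually got registered
would be the views metadata itself (a sketch, assuming the prototype's
IGNITE_SYSTEM_VIEWS view is kept under the new IGNITE schema as SYSTEM_VIEWS):

SELECT * FROM IGNITE.SYSTEM_VIEWS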

On Wed, Feb 14, 2018 at 12:01 PM, Vladimir Ozerov <[hidden email]>
wrote:

> I would start with NODES and NODE_ATTRIBUTES as the most simple thing.
>
> On Tue, Feb 13, 2018 at 4:10 AM, Denis Magda <[hidden email]> wrote:
>
> > Alex P, sounds like a good plan for me.
> >
> > Vladimir, do you have any suggestions or corrections?
> >
> > —
> > Denis
> >
> > > On Feb 12, 2018, at 4:57 AM, Alex Plehanov <[hidden email]>
> > wrote:
> > >
> > > The views engine and the first view are almost ready to merge (review
> > > comments are resolved). Which views should we take next? My proposal -
> > > NODES, NODE_ATTRIBUTES, NODE_METRICS, NODE_HOSTS and NODE_ADDRESSES,
> > since
> > > these views are clear and all topology data available on each node.
> > > Any objections?
> > >
> > > 2018-01-25 16:27 GMT+03:00 Alex Plehanov <[hidden email]>:
> > >
> > >> Anton, Vladimir, I've made some fixes. There is only one view left and
> > >> it's renamed to 'IGNITE.LOCAL_TRANSACTIONS'.
> > >>
> > >> High level design of solution:
> > >> When IgniteH2Indexing is starting, it create and start
> > >> new GridH2SysViewProcessor, which create and register in H2 (via its
> own
> > >> table engine) all implementations of system views. Each system view
> > >> implementation extends base abstract class GridH2SysView. View
> > >> implementation describes columns, their types and indexes in
> constructor
> > >> and must override method getRows for data retrieval (this method
> called
> > by
> > >> H2-compatible table and index implementations for ignite system
> views).
> > >> Almost no fixes to existing parsing engine was made, except some
> places,
> > >> where GridH2Table instance was expected, but for system views there is
> > >> another class.
> > >>
> > >> New PR: [1].  Please have a look.
> > >>
> > >> [1] https://github.com/apache/ignite/pull/3433
> > >>
> > >> 2018-01-24 19:12 GMT+03:00 Anton Vinogradov <[hidden email]
> >:
> > >>
> > >>> I've created IEP-13 [1] to cover all cases.
> > >>> Feel free to create issues.
> > >>>
> > >>> [1]
> > >>> https://cwiki.apache.org/confluence/pages/viewpage.
> > action?pageId=75962769
> > >>>
> > >>> On Wed, Jan 24, 2018 at 6:10 PM, Vladimir Ozerov <
> [hidden email]
> > >
> > >>> wrote:
> > >>>
> > >>>> Let's start with a single and the most simple view, e.g.
> > >>>> LOCAL_TRANSACTIONS. We will review and merge it along with necessary
> > >>>> infrastructure. Then will handle the rest view in separate tickets
> and
> > >>>> separate focused discussions.
> > >>>>
> > >>>> On Wed, Jan 24, 2018 at 5:29 PM, Alex Plehanov <
> > [hidden email]
> > >>>>
> > >>>> wrote:
> > >>>>
> > >>>>> 1) It’s not a principal point, I can change schema. The
> > >>>> INFORMATION_SCHEMA
> > >>>>> was used because it’s already exists and usually used for metadata
> > >>> tables
> > >>>>> and views. Your proposal is to use schema “IGNITE”, am I understand
> > >>> you
> > >>>>> right? BTW, for now, we can’t query another (H2) meta tables from
> the
> > >>>>> INFORMATION_SCHEMA, so, “Ignite system views” is only available
> views
> > >>> to
> > >>>>> query from this schema.
> > >>>>> 2) Exactly for this reason the IGNITE_INSTANCE view is useful: to
> > >>>> determine
> > >>>>> which node we are connected to.
> > >>>>> 3) As the first phase, in my opinion, local views will be enough.
> > >>>>> Performance and caching of distributed views should be discussed at
> > >>> next
> > >>>>> phases, when distributed views implementation will be planned. In
> > >>> current
> > >>>>> implementation I tried to use indexing for local views wherever
> it’s
> > >>>>> possible.
> > >>>>> 4) I don’t think, that JVM info is more critical information than,
> > for
> > >>>>> example, caches or nodes information. When authorization
> capabilities
> > >>>>> planned to implement?
> > >>>>>
> > >>>>> About local data: yes, we can rename all currently implemented
> views
> > >>> for
> > >>>>> the local node data as LOCAL_..., and create (someday) new whole
> > >>> cluster
> > >>>>> views (which use distributed requests) without prefix or, for
> > example,
> > >>>> with
> > >>>>> CLUSTER_ prefix. But some views can show all cluster information
> > using
> > >>>> only
> > >>>>> local node data, without distributed requests (for example
> > >>>>> IGNITE_NODE_METRICS, IGNITE_PART_ASSIGNMENT,
> IGNITE_PART_ALLOCATION,
> > >>>>> IGNITE_NODES, etc). Are they local or cluster views in this
> concept?
> > >>>> Which
> > >>>>> prefix should be used? And what about caches? Are they local or
> > >>> cluster?
> > >>>> On
> > >>>>> local node we can see cluster wide caches (replicated and
> > distributed)
> > >>>> and
> > >>>>> caches for current node only. Local caches list may differ from
> node
> > >>> to
> > >>>>> node. Which prefix should be used for this view? And one more,
> there
> > >>> is
> > >>>> no
> > >>>>> sense for some views to make them cluster wide (for example
> > >>>>> INGNITE_INSTANCE). Should we name it LOCAL_INSTANCE without
> creating
> > >>>>> INSTANCE view?
> > >>>>>
> > >>>>> So, next steps: split PR, change schema name (IGNITE?), change view
> > >>> name
> > >>>>> for caches (CACHES, LOCAL_CACHES?)
> > >>>>>
> > >>>>>
> > >>>>> 2018-01-24 13:03 GMT+03:00 Vladimir Ozerov <[hidden email]>:
> > >>>>>
> > >>>>>> Hi Alex,
> > >>>>>>
> > >>>>>> System views could be extremely valuable addition for Ignite.
> > >>> Ideally,
> > >>>>> user
> > >>>>>> should be able to monitor and manage state of the whole cluster
> > >>> with a
> > >>>>>> single SQL command line. We have plans to implement it for a very
> > >>> long
> > >>>>>> time. However, this is very sensitive task which should take a lot
> > >>> of
> > >>>>>> moving pieces in count, such as usability, consistency,
> performance,
> > >>>>>> security, etc..
> > >>>>>>
> > >>>>>> Let me point several major concerns I see at the moment:
> > >>>>>>
> > >>>>>> 1) Usability: INFORMATION_SCHEMA
> > >>>>>> This schema is part of SQL ANSI standard. When creating system
> > >>> views,
> > >>>>> some
> > >>>>>> vendors prefer to store them in completely different predefined
> > >>> schema
> > >>>>>> (Oracle, MS SQL). Others prefer to keep them in INFORMATION_SCHEMA
> > >>>>>> directly. Both approaches could work. However, the latter breaks
> > >>>>> separation
> > >>>>>> of concerns - we store typical metadata near to possibly sensitive
> > >>>> system
> > >>>>>> data. Also it makes security management more complex - system data
> > >>> is
> > >>>>> very
> > >>>>>> sensitive, and now we cannot simply grant access
> > >>> INFORMATIONAL_SCHEMA
> > >>>> to
> > >>>>>> user. Instead, we have to grant that access on per-view basis. For
> > >>> this
> > >>>>>> reason my preference is to store system tables in separate schema,
> > >>> not
> > >>>> in
> > >>>>>> INFORMATION_SCHEMA
> > >>>>>>
> > >>>>>> 2) Consistency: local data
> > >>>>>> One of implemented view GridH2SysViewImplInstance. Normally SQL
> > >>> users
> > >>>>>> communicate with Ignite through JDBC/ODBC drivers. These drivers
> are
> > >>>>>> connected to a single node, typically client node. Moreover, we
> will
> > >>>>>> introduce high-availability feature when drivers were able to
> > >>> connect
> > >>>> to
> > >>>>>> any address from a predefined list. It renders this view useless,
> as
> > >>>> you
> > >>>>> do
> > >>>>>> not know which node you connected to. Also, local-only data cannot
> > >>> be
> > >>>>>> joined in general case - you will receive different results on
> > >>>> different
> > >>>>>> nodes. The same goes for transactions, JVM info, etc.
> > >>>>>>
> > >>>>>> 3) Performance
> > >>>>>> Suppose we fixed consistency of transactions and now this view
> shows
> > >>>>>> transactions in the whole cluster with possibility to filter them
> by
> > >>>>> nodes
> > >>>>>> - this is what user would expect out of the box. Another problem
> > >>>> appears
> > >>>>>> then - performance. How would we collect necessary data? How would
> > >>> we
> > >>>>>> handle joins, when particular view could be scanned multiple times
> > >>>> during
> > >>>>>> query execution? How we achieve sensible consistency? Most
> probably
> > >>> we
> > >>>>>> would collect remote data once when query is started, cache it
> > >>> somehow
> > >>>> on
> > >>>>>> query session level, and then re-use during joins. But again, this
> > >>>> should
> > >>>>>> be discussed separately.
> > >>>>>>
> > >>>>>> 4) Security: JVM info
> > >>>>>> We should define clear boundaries of what info is exposed. JVM
> data
> > >>>> along
> > >>>>>> with running threads is critically sensitive information. We
> should
> > >>> not
> > >>>>>> expose it until we have authorization capabilities.
> > >>>>>>
> > >>>>>> In order to start moving this code from prototype to production
> > >>> state
> > >>>> we
> > >>>>>> should start with the most simple and consistent views. E.g.
> > >>>>> IGNITE_CACHES.
> > >>>>>> Let's move it to a separate PR, review infrastructure code, review
> > >>> view
> > >>>>>> implementation, agree on proper naming and placement, and merge
> it.
> > >>>> Then
> > >>>>>> each and every view (or group of related views) should be
> discussed
> > >>> and
> > >>>>>> reviewed separately.
> > >>>>>>
> > >>>>>> As far as node-local stuff, may be we should move it to a separate
> > >>>>> schema,
> > >>>>>> or mark with special prefix. E.g. "IGNITE.TRANSACTIONS" - all
> > >>>>> transactions
> > >>>>>> in the cluster, "IGNITE.LOCAL_TRANSACTIONS" - transactions on the
> > >>> local
> > >>>>>> node. In this case we will be able to merge "local" stuff shortly,
> > >>> and
> > >>>>>> implement more complex but at the same time much more useful
> > >>>> distributed
> > >>>>>> stuff later on.
> > >>>>>>
> > >>>>>> Makes sense?
> > >>>>>>
> > >>>>>> Vladimir.
> > >>>>>>
> > >>>>>>
> > >>>>>> On Tue, Jan 23, 2018 at 8:30 PM, Alex Plehanov <
> > >>>> [hidden email]>
> > >>>>>> wrote:
> > >>>>>>
> > >>>>>>> Hello, Igniters!
> > >>>>>>>
> > >>>>>>> For Ignite diagnostic usually it’s helpful to get some Ignite
> > >>>> internals
> > >>>>>>> information. But currently, in my opinion, there are no
> convenient
> > >>>>> tools
> > >>>>>>> for this purpose:
> > >>>>>>>
> > >>>>>>> ·        Some issues can be solved by analyzing log files. Log
> > >>> files
> > >>>>> are
> > >>>>>>> useful for dumps, but sometimes they are difficult to read. Also
> > >>>>>>> interesting metrics can’t be received runtime by request, we need
> > >>> to
> > >>>>> wait
> > >>>>>>> until Ignite will write these metrics by timeout or other events.
> > >>>>>>>
> > >>>>>>> ·        JMX is useful for scalar metrics. Complex and table data
> > >>> can
> > >>>>>> also
> > >>>>>>> be received, but it’s difficult to read, filter and sort them
> > >>> without
> > >>>>>>> processing by specialized external tools. For most frequently
> used
> > >>>>> cases
> > >>>>>>> almost duplicating metrics are created to show data in an
> > >>>> easy-to-read
> > >>>>>>> form.
> > >>>>>>>
> > >>>>>>> ·        Web-console is able to show table and complex data.
> > >>> Perhaps,
> > >>>>>>> someday  web-console will contain all necessary dashboards for
> > >>> most
> > >>>>>> problem
> > >>>>>>> investigation, but some non-trivial queries will not be covered
> > >>>> anyway.
> > >>>>>>> Also web-console needs additional infrastructure to work.
> > >>>>>>>
> > >>>>>>> ·        External “home-made” tools can be used for non-trivial
> > >>>> cases.
> > >>>>>> They
> > >>>>>>> cover highly specialized cases and usually can’t be used as
> > >>> general
> > >>>>>> purpose
> > >>>>>>> tools.
> > >>>>>>>
> > >>>>>>> Sometimes we are forced to use more than one tool and join data
> by
> > >>>>> hands
> > >>>>>>> (for example, current thread dump and data from logs).
> > >>>>>>>
> > >>>>>>> Often RDBMS for diagnostic purposes provides system views (for
> > >>>> example,
> > >>>>>>> DBA_% and V$% in Oracle), which can be queried by SQL. This
> > >>> solution
> > >>>>>> makes
> > >>>>>>> all internal diagnostic information available in a readable form
> > >>>> (with
> > >>>>>> all
> > >>>>>>> possible filters and projections) without using any other
> > >>> internal or
> > >>>>>>> external tools. My proposal is to create similar system views in
> > >>>>> Ignite.
> > >>>>>>>
> > >>>>>>> I implement working prototype (PR: [1]). It contains views:
> > >>>>>>>
> > >>>>>>> IGNITE_SYSTEM_VIEWS
> > >>>>>>>
> > >>>>>>> Registered system views
> > >>>>>>>
> > >>>>>>> IGNITE_INSTANCE
> > >>>>>>>
> > >>>>>>> Ignite instance
> > >>>>>>>
> > >>>>>>> IGNITE_JVM_THREADS
> > >>>>>>>
> > >>>>>>> JVM threads
> > >>>>>>>
> > >>>>>>> IGNITE_JVM_RUNTIME
> > >>>>>>>
> > >>>>>>> JVM runtime
> > >>>>>>>
> > >>>>>>> IGNITE_JVM_OS
> > >>>>>>>
> > >>>>>>> JVM operating system
> > >>>>>>>
> > >>>>>>> IGNITE_CACHES
> > >>>>>>>
> > >>>>>>> Ignite caches
> > >>>>>>>
> > >>>>>>> IGNITE_CACHE_CLUSTER_METRICS
> > >>>>>>>
> > >>>>>>> Ignite cache cluster metrics
> > >>>>>>>
> > >>>>>>> IGNITE_CACHE_NODE_METRICS
> > >>>>>>>
> > >>>>>>> Ignite cache node metrics
> > >>>>>>>
> > >>>>>>> IGNITE_CACHE_GROUPS
> > >>>>>>>
> > >>>>>>> Cache groups
> > >>>>>>>
> > >>>>>>> IGNITE_NODES
> > >>>>>>>
> > >>>>>>> Nodes in topology
> > >>>>>>>
> > >>>>>>> IGNITE_NODE_HOSTS
> > >>>>>>>
> > >>>>>>> Node hosts
> > >>>>>>>
> > >>>>>>> IGNITE_NODE_ADDRESSES
> > >>>>>>>
> > >>>>>>> Node addresses
> > >>>>>>>
> > >>>>>>> IGNITE_NODE_ATTRIBUTES
> > >>>>>>>
> > >>>>>>> Node attributes
> > >>>>>>>
> > >>>>>>> IGNITE_NODE_METRICS
> > >>>>>>>
> > >>>>>>> Node metrics
> > >>>>>>>
> > >>>>>>> IGNITE_TRANSACTIONS
> > >>>>>>>
> > >>>>>>> Active transactions
> > >>>>>>>
> > >>>>>>> IGNITE_TRANSACTION_ENTRIES
> > >>>>>>>
> > >>>>>>> Cache entries used by transaction
> > >>>>>>>
> > >>>>>>> IGNITE_TASKS
> > >>>>>>>
> > >>>>>>> Active tasks
> > >>>>>>>
> > >>>>>>> IGNITE_PART_ASSIGNMENT
> > >>>>>>>
> > >>>>>>> Partition assignment map
> > >>>>>>>
> > >>>>>>> IGNITE_PART_ALLOCATION
> > >>>>>>>
> > >>>>>>> Partition allocation map
> > >>>>>>>
> > >>>>>>>
> > >>>>>>>
> > >>>>>>> There are much more useful views can be implemented (executors
> > >>>>>> diagnostic,
> > >>>>>>> SPIs diagnostic, etc).
> > >>>>>>>
> > >>>>>>> Some usage examples:
> > >>>>>>>
> > >>>>>>> Cache groups and their partitions, which used by transaction more
> > >>>> than
> > >>>>> 5
> > >>>>>>> minutes long:
> > >>>>>>>
> > >>>>>>> SELECT cg.CACHE_OR_GROUP_NAME, te.KEY_PARTITION, count(*) AS
> > >>>>> ENTITIES_CNT
> > >>>>>>> FROM INFORMATION_SCHEMA.IGNITE_TRANSACTIONS t
> > >>>>>>> JOIN INFORMATION_SCHEMA.IGNITE_TRANSACTION_ENTRIES te ON t.XID =
> > >>>>> te.XID
> > >>>>>>> JOIN INFORMATION_SCHEMA.IGNITE_CACHES c ON te.CACHE_NAME =
> c.NAME
> > >>>>>>> JOIN INFORMATION_SCHEMA.IGNITE_CACHE_GROUPS cg ON c.GROUP_ID =
> > >>> cg.ID
> > >>>>>>> WHERE t.START_TIME < TIMESTAMPADD('MINUTE', -5, NOW())
> > >>>>>>> GROUP BY cg.CACHE_OR_GROUP_NAME, te.KEY_PARTITION
> > >>>>>>>
> > >>>>>>>
> > >>>>>>>
> > >>>>>>> Average CPU load on server nodes grouped by operating system:
> > >>>>>>>
> > >>>>>>> SELECT na.VALUE, COUNT(n.ID), AVG(nm.AVG_CPU_LOAD) AVG_CPU_LOAD
> > >>>>>>> FROM INFORMATION_SCHEMA.IGNITE_NODES n
> > >>>>>>> JOIN INFORMATION_SCHEMA.IGNITE_NODE_ATTRIBUTES na ON na.NODE_ID
> =
> > >>>> n.ID
> > >>>>>> AND
> > >>>>>>> na.NAME = 'os.name'
> > >>>>>>> JOIN INFORMATION_SCHEMA.IGNITE_NODE_METRICS nm ON nm.NODE_ID =
> > >>> n.ID
> > >>>>>>> WHERE n.IS_CLIENT = false
> > >>>>>>> GROUP BY na.VALUE
> > >>>>>>>
> > >>>>>>>
> > >>>>>>>
> > >>>>>>> Top 5 nodes by puts to cache ‘cache’:
> > >>>>>>>
> > >>>>>>> SELECT cm.NODE_ID, cm.CACHE_PUTS FROM
> > >>>>>>> INFORMATION_SCHEMA.IGNITE_CACHE_NODE_METRICS cm
> > >>>>>>> WHERE cm.CACHE_NAME = 'cache'
> > >>>>>>> ORDER BY cm.CACHE_PUTS DESC
> > >>>>>>> LIMIT 5
> > >>>>>>>
> > >>>>>>>
> > >>>>>>>
> > >>>>>>> Does this implementation interesting to someone else? Maybe any
> > >>> views
> > >>>>> are
> > >>>>>>> redundant? Which additional first-priority views must be
> > >>> implemented?
> > >>>>> Any
> > >>>>>>> other thoughts or proposal?
> > >>>>>>>
> > >>>>>>> [1] https://github.com/apache/ignite/pull/3413
> > >>>>>>>
> > >>>>>>
> > >>>>>
> > >>>>
> > >>>
> > >>
> > >>
> >
> >
>

Re: Ignite diagnostic (SQL system views)

dmagda
Anton, Maxim,

Are we planning to release the views as part of 2.8? I don't see them listed
in the important features section:
https://cwiki.apache.org/confluence/display/IGNITE/Apache+Ignite+2.8#ApacheIgnite2.8-Themostimportantreleasetasks

-
Denis


On Wed, Feb 14, 2018 at 1:49 AM Anton Vinogradov <[hidden email]>
wrote:

> Vova,
>
> Could you confirm https://issues.apache.org/jira/browse/IGNITE-7527 ready
> to be merged?
>
> On Wed, Feb 14, 2018 at 12:01 PM, Vladimir Ozerov <[hidden email]>
> wrote:
>
> > I would start with NODES and NODE_ATTRIBUTES as the most simple thing.
> >
> > On Tue, Feb 13, 2018 at 4:10 AM, Denis Magda <[hidden email]> wrote:
> >
> > > Alex P, sounds like a good plan for me.
> > >
> > > Vladimir, do you have any suggestions or corrections?
> > >
> > > —
> > > Denis
> > >
> > > > On Feb 12, 2018, at 4:57 AM, Alex Plehanov <[hidden email]>
> > > wrote:
> > > >
> > > > The views engine and the first view are almost ready to merge (review
> > > > comments are resolved). Which views should we take next? My proposal
> -
> > > > NODES, NODE_ATTRIBUTES, NODE_METRICS, NODE_HOSTS and NODE_ADDRESSES,
> > > since
> > > > these views are clear and all topology data available on each node.
> > > > Any objections?
> > > >
> > > > 2018-01-25 16:27 GMT+03:00 Alex Plehanov <[hidden email]>:
> > > >
> > > >> Anton, Vladimir, I've made some fixes. There is only one view left
> and
> > > >> it's renamed to 'IGNITE.LOCAL_TRANSACTIONS'.
> > > >>
> > > >> High level design of solution:
> > > >> When IgniteH2Indexing is starting, it create and start
> > > >> new GridH2SysViewProcessor, which create and register in H2 (via its
> > own
> > > >> table engine) all implementations of system views. Each system view
> > > >> implementation extends base abstract class GridH2SysView. View
> > > >> implementation describes columns, their types and indexes in
> > constructor
> > > >> and must override method getRows for data retrieval (this method
> > called
> > > by
> > > >> H2-compatible table and index implementations for ignite system
> > views).
> > > >> Almost no fixes to existing parsing engine was made, except some
> > places,
> > > >> where GridH2Table instance was expected, but for system views there
> is
> > > >> another class.
> > > >>
> > > >> New PR: [1].  Please have a look.
> > > >>
> > > >> [1] https://github.com/apache/ignite/pull/3433
> > > >>
> > > >> 2018-01-24 19:12 GMT+03:00 Anton Vinogradov <
> [hidden email]
> > >:
> > > >>
> > > >>> I've created IEP-13 [1] to cover all cases.
> > > >>> Feel free to create issues.
> > > >>>
> > > >>> [1]
> > > >>> https://cwiki.apache.org/confluence/pages/viewpage.
> > > action?pageId=75962769
> > > >>>
> > > >>> On Wed, Jan 24, 2018 at 6:10 PM, Vladimir Ozerov <
> > [hidden email]
> > > >
> > > >>> wrote:
> > > >>>
> > > >>>> Let's start with a single and the most simple view, e.g.
> > > >>>> LOCAL_TRANSACTIONS. We will review and merge it along with
> necessary
> > > >>>> infrastructure. Then will handle the rest view in separate tickets
> > and
> > > >>>> separate focused discussions.
> > > >>>>
> > > >>>> On Wed, Jan 24, 2018 at 5:29 PM, Alex Plehanov <
> > > [hidden email]
> > > >>>>
> > > >>>> wrote:
> > > >>>>
> > > >>>>> 1) It’s not a principal point, I can change schema. The
> > > >>>> INFORMATION_SCHEMA
> > > >>>>> was used because it’s already exists and usually used for
> metadata
> > > >>> tables
> > > >>>>> and views. Your proposal is to use schema “IGNITE”, am I
> understand
> > > >>> you
> > > >>>>> right? BTW, for now, we can’t query another (H2) meta tables from
> > the
> > > >>>>> INFORMATION_SCHEMA, so, “Ignite system views” is only available
> > views
> > > >>> to
> > > >>>>> query from this schema.
> > > >>>>> 2) Exactly for this reason the IGNITE_INSTANCE view is useful: to
> > > >>>> determine
> > > >>>>> which node we are connected to.
> > > >>>>> 3) As the first phase, in my opinion, local views will be enough.
> > > >>>>> Performance and caching of distributed views should be discussed
> at
> > > >>> next
> > > >>>>> phases, when distributed views implementation will be planned. In
> > > >>> current
> > > >>>>> implementation I tried to use indexing for local views wherever
> > it’s
> > > >>>>> possible.
> > > >>>>> 4) I don’t think, that JVM info is more critical information
> than,
> > > for
> > > >>>>> example, caches or nodes information. When authorization
> > capabilities
> > > >>>>> planned to implement?
> > > >>>>>
> > > >>>>> About local data: yes, we can rename all currently implemented
> > views
> > > >>> for
> > > >>>>> the local node data as LOCAL_..., and create (someday) new whole
> > > >>> cluster
> > > >>>>> views (which use distributed requests) without prefix or, for
> > > example,
> > > >>>> with
> > > >>>>> CLUSTER_ prefix. But some views can show all cluster information
> > > using
> > > >>>> only
> > > >>>>> local node data, without distributed requests (for example
> > > >>>>> IGNITE_NODE_METRICS, IGNITE_PART_ASSIGNMENT,
> > IGNITE_PART_ALLOCATION,
> > > >>>>> IGNITE_NODES, etc). Are they local or cluster views in this
> > concept?
> > > >>>> Which
> > > >>>>> prefix should be used? And what about caches? Are they local or
> > > >>> cluster?
> > > >>>> On
> > > >>>>> local node we can see cluster wide caches (replicated and
> > > distributed)
> > > >>>> and
> > > >>>>> caches for current node only. Local caches list may differ from
> > node
> > > >>> to
> > > >>>>> node. Which prefix should be used for this view? And one more,
> > there
> > > >>> is
> > > >>>> no
> > > >>>>> sense for some views to make them cluster wide (for example
> > > >>>>> INGNITE_INSTANCE). Should we name it LOCAL_INSTANCE without
> > creating
> > > >>>>> INSTANCE view?
> > > >>>>>
> > > >>>>> So, next steps: split PR, change schema name (IGNITE?), change
> view
> > > >>> name
> > > >>>>> for caches (CACHES, LOCAL_CACHES?)
> > > >>>>>
> > > >>>>>
> > > >>>>> 2018-01-24 13:03 GMT+03:00 Vladimir Ozerov <[hidden email]
> >:
> > > >>>>>
> > > >>>>>> Hi Alex,
> > > >>>>>>
> > > >>>>>> System views could be extremely valuable addition for Ignite.
> > > >>> Ideally,
> > > >>>>> user
> > > >>>>>> should be able to monitor and manage state of the whole cluster
> > > >>> with a
> > > >>>>>> single SQL command line. We have plans to implement it for a
> very
> > > >>> long
> > > >>>>>> time. However, this is very sensitive task which should take a
> lot
> > > >>> of
> > > >>>>>> moving pieces in count, such as usability, consistency,
> > performance,
> > > >>>>>> security, etc..
> > > >>>>>>
> > > >>>>>> Let me point several major concerns I see at the moment:
> > > >>>>>>
> > > >>>>>> 1) Usability: INFORMATION_SCHEMA
> > > >>>>>> This schema is part of SQL ANSI standard. When creating system
> > > >>> views,
> > > >>>>> some
> > > >>>>>> vendors prefer to store them in completely different predefined
> > > >>> schema
> > > >>>>>> (Oracle, MS SQL). Others prefer to keep them in
> INFORMATION_SCHEMA
> > > >>>>>> directly. Both approaches could work. However, the latter breaks
> > > >>>>> separation
> > > >>>>>> of concerns - we store typical metadata near to possibly
> sensitive
> > > >>>> system
> > > >>>>>> data. Also it makes security management more complex - system
> data
> > > >>> is
> > > >>>>> very
> > > >>>>>> sensitive, and now we cannot simply grant access
> > > >>> INFORMATIONAL_SCHEMA
> > > >>>> to
> > > >>>>>> user. Instead, we have to grant that access on per-view basis.
> For
> > > >>> this
> > > >>>>>> reason my preference is to store system tables in separate
> schema,
> > > >>> not
> > > >>>> in
> > > >>>>>> INFORMATION_SCHEMA
> > > >>>>>>
> > > >>>>>> 2) Consistency: local data
> > > >>>>>> One of implemented view GridH2SysViewImplInstance. Normally SQL
> > > >>> users
> > > >>>>>> communicate with Ignite through JDBC/ODBC drivers. These drivers
> > are
> > > >>>>>> connected to a single node, typically client node. Moreover, we
> > will
> > > >>>>>> introduce high-availability feature when drivers were able to
> > > >>> connect
> > > >>>> to
> > > >>>>>> any address from a predefined list. It renders this view
> useless,
> > as
> > > >>>> you
> > > >>>>> do
> > > >>>>>> not know which node you connected to. Also, local-only data
> cannot
> > > >>> be
> > > >>>>>> joined in general case - you will receive different results on
> > > >>>> different
> > > >>>>>> nodes. The same goes for transactions, JVM info, etc.
> > > >>>>>>
> > > >>>>>> 3) Performance
> > > >>>>>> Suppose we fixed consistency of transactions and now this view
> > shows
> > > >>>>>> transactions in the whole cluster with possibility to filter
> them
> > by
> > > >>>>> nodes
> > > >>>>>> - this is what user would expect out of the box. Another problem
> > > >>>> appears
> > > >>>>>> then - performance. How would we collect necessary data? How
> would
> > > >>> we
> > > >>>>>> handle joins, when particular view could be scanned multiple
> times
> > > >>>> during
> > > >>>>>> query execution? How we achieve sensible consistency? Most
> > probably
> > > >>> we
> > > >>>>>> would collect remote data once when query is started, cache it
> > > >>> somehow
> > > >>>> on
> > > >>>>>> query session level, and then re-use during joins. But again,
> this
> > > >>>> should
> > > >>>>>> be discussed separately.
> > > >>>>>>
> > > >>>>>> 4) Security: JVM info
> > > >>>>>> We should define clear boundaries of what info is exposed. JVM
> > data
> > > >>>> along
> > > >>>>>> with running threads is critically sensitive information. We
> > should
> > > >>> not
> > > >>>>>> expose it until we have authorization capabilities.
> > > >>>>>>
> > > >>>>>> In order to start moving this code from prototype to production
> > > >>> state
> > > >>>> we
> > > >>>>>> should start with the most simple and consistent views. E.g.
> > > >>>>> IGNITE_CACHES.
> > > >>>>>> Let's move it to a separate PR, review infrastructure code,
> review
> > > >>> view
> > > >>>>>> implementation, agree on proper naming and placement, and merge
> > it.
> > > >>>> Then
> > > >>>>>> each and every view (or group of related views) should be
> > discussed
> > > >>> and
> > > >>>>>> reviewed separately.
> > > >>>>>>
> > > >>>>>> As far as node-local stuff, may be we should move it to a
> separate
> > > >>>>> schema,
> > > >>>>>> or mark with special prefix. E.g. "IGNITE.TRANSACTIONS" - all
> > > >>>>> transactions
> > > >>>>>> in the cluster, "IGNITE.LOCAL_TRANSACTIONS" - transactions on
> the
> > > >>> local
> > > >>>>>> node. In this case we will be able to merge "local" stuff
> shortly,
> > > >>> and
> > > >>>>>> implement more complex but at the same time much more useful
> > > >>>> distributed
> > > >>>>>> stuff later on.
> > > >>>>>>
> > > >>>>>> Makes sense?
> > > >>>>>>
> > > >>>>>> Vladimir.
> > > >>>>>>
> > > >>>>>>
> > > >>>>>> On Tue, Jan 23, 2018 at 8:30 PM, Alex Plehanov <
> > > >>>> [hidden email]>
> > > >>>>>> wrote:
> > > >>>>>>
> > > >>>>>>> Hello, Igniters!
> > > >>>>>>>
> > > >>>>>>> For Ignite diagnostic usually it’s helpful to get some Ignite
> > > >>>> internals
> > > >>>>>>> information. But currently, in my opinion, there are no
> > convenient
> > > >>>>> tools
> > > >>>>>>> for this purpose:
> > > >>>>>>>
> > > >>>>>>> ·        Some issues can be solved by analyzing log files. Log
> > > >>> files
> > > >>>>> are
> > > >>>>>>> useful for dumps, but sometimes they are difficult to read.
> Also
> > > >>>>>>> interesting metrics can’t be received runtime by request, we
> need
> > > >>> to
> > > >>>>> wait
> > > >>>>>>> until Ignite will write these metrics by timeout or other
> events.
> > > >>>>>>>
> > > >>>>>>> ·        JMX is useful for scalar metrics. Complex and table
> data
> > > >>> can
> > > >>>>>> also
> > > >>>>>>> be received, but it’s difficult to read, filter and sort them
> > > >>> without
> > > >>>>>>> processing by specialized external tools. For most frequently
> > used
> > > >>>>> cases
> > > >>>>>>> almost duplicating metrics are created to show data in an
> > > >>>> easy-to-read
> > > >>>>>>> form.
> > > >>>>>>>
> > > >>>>>>> ·        Web-console is able to show table and complex data.
> > > >>> Perhaps,
> > > >>>>>>> someday  web-console will contain all necessary dashboards for
> > > >>> most
> > > >>>>>> problem
> > > >>>>>>> investigation, but some non-trivial queries will not be covered
> > > >>>> anyway.
> > > >>>>>>> Also web-console needs additional infrastructure to work.
> > > >>>>>>>
> > > >>>>>>> ·        External “home-made” tools can be used for non-trivial
> > > >>>> cases.
> > > >>>>>> They
> > > >>>>>>> cover highly specialized cases and usually can’t be used as
> > > >>> general
> > > >>>>>> purpose
> > > >>>>>>> tools.
> > > >>>>>>>
> > > >>>>>>> Sometimes we are forced to use more than one tool and join data
> > by
> > > >>>>> hands
> > > >>>>>>> (for example, current thread dump and data from logs).
> > > >>>>>>>
> > > >>>>>>> Often RDBMS for diagnostic purposes provides system views (for
> > > >>>> example,
> > > >>>>>>> DBA_% and V$% in Oracle), which can be queried by SQL. This
> > > >>> solution
> > > >>>>>> makes
> > > >>>>>>> all internal diagnostic information available in a readable
> form
> > > >>>> (with
> > > >>>>>> all
> > > >>>>>>> possible filters and projections) without using any other
> > > >>> internal or
> > > >>>>>>> external tools. My proposal is to create similar system views
> in
> > > >>>>> Ignite.
> > > >>>>>>>
> > > >>>>>>> I implement working prototype (PR: [1]). It contains views:
> > > >>>>>>>
> > > >>>>>>> IGNITE_SYSTEM_VIEWS
> > > >>>>>>>
> > > >>>>>>> Registered system views
> > > >>>>>>>
> > > >>>>>>> IGNITE_INSTANCE
> > > >>>>>>>
> > > >>>>>>> Ignite instance
> > > >>>>>>>
> > > >>>>>>> IGNITE_JVM_THREADS
> > > >>>>>>>
> > > >>>>>>> JVM threads
> > > >>>>>>>
> > > >>>>>>> IGNITE_JVM_RUNTIME
> > > >>>>>>>
> > > >>>>>>> JVM runtime
> > > >>>>>>>
> > > >>>>>>> IGNITE_JVM_OS
> > > >>>>>>>
> > > >>>>>>> JVM operating system
> > > >>>>>>>
> > > >>>>>>> IGNITE_CACHES
> > > >>>>>>>
> > > >>>>>>> Ignite caches
> > > >>>>>>>
> > > >>>>>>> IGNITE_CACHE_CLUSTER_METRICS
> > > >>>>>>>
> > > >>>>>>> Ignite cache cluster metrics
> > > >>>>>>>
> > > >>>>>>> IGNITE_CACHE_NODE_METRICS
> > > >>>>>>>
> > > >>>>>>> Ignite cache node metrics
> > > >>>>>>>
> > > >>>>>>> IGNITE_CACHE_GROUPS
> > > >>>>>>>
> > > >>>>>>> Cache groups
> > > >>>>>>>
> > > >>>>>>> IGNITE_NODES
> > > >>>>>>>
> > > >>>>>>> Nodes in topology
> > > >>>>>>>
> > > >>>>>>> IGNITE_NODE_HOSTS
> > > >>>>>>>
> > > >>>>>>> Node hosts
> > > >>>>>>>
> > > >>>>>>> IGNITE_NODE_ADDRESSES
> > > >>>>>>>
> > > >>>>>>> Node addresses
> > > >>>>>>>
> > > >>>>>>> IGNITE_NODE_ATTRIBUTES
> > > >>>>>>>
> > > >>>>>>> Node attributes
> > > >>>>>>>
> > > >>>>>>> IGNITE_NODE_METRICS
> > > >>>>>>>
> > > >>>>>>> Node metrics
> > > >>>>>>>
> > > >>>>>>> IGNITE_TRANSACTIONS
> > > >>>>>>>
> > > >>>>>>> Active transactions
> > > >>>>>>>
> > > >>>>>>> IGNITE_TRANSACTION_ENTRIES
> > > >>>>>>>
> > > >>>>>>> Cache entries used by transaction
> > > >>>>>>>
> > > >>>>>>> IGNITE_TASKS
> > > >>>>>>>
> > > >>>>>>> Active tasks
> > > >>>>>>>
> > > >>>>>>> IGNITE_PART_ASSIGNMENT
> > > >>>>>>>
> > > >>>>>>> Partition assignment map
> > > >>>>>>>
> > > >>>>>>> IGNITE_PART_ALLOCATION
> > > >>>>>>>
> > > >>>>>>> Partition allocation map
> > > >>>>>>>
> > > >>>>>>>
> > > >>>>>>>
> > > >>>>>>> There are much more useful views can be implemented (executors
> > > >>>>>> diagnostic,
> > > >>>>>>> SPIs diagnostic, etc).
> > > >>>>>>>
> > > >>>>>>> Some usage examples:
> > > >>>>>>>
> > > >>>>>>> Cache groups and their partitions, which used by transaction
> more
> > > >>>> than
> > > >>>>> 5
> > > >>>>>>> minutes long:
> > > >>>>>>>
> > > >>>>>>> SELECT cg.CACHE_OR_GROUP_NAME, te.KEY_PARTITION, count(*) AS
> > > >>>>> ENTITIES_CNT
> > > >>>>>>> FROM INFORMATION_SCHEMA.IGNITE_TRANSACTIONS t
> > > >>>>>>> JOIN INFORMATION_SCHEMA.IGNITE_TRANSACTION_ENTRIES te ON t.XID
> =
> > > >>>>> te.XID
> > > >>>>>>> JOIN INFORMATION_SCHEMA.IGNITE_CACHES c ON te.CACHE_NAME =
> > c.NAME
> > > >>>>>>> JOIN INFORMATION_SCHEMA.IGNITE_CACHE_GROUPS cg ON c.GROUP_ID =
> > > >>> cg.ID
> > > >>>>>>> WHERE t.START_TIME < TIMESTAMPADD('MINUTE', -5, NOW())
> > > >>>>>>> GROUP BY cg.CACHE_OR_GROUP_NAME, te.KEY_PARTITION
> > > >>>>>>>
> > > >>>>>>>
> > > >>>>>>>
> > > >>>>>>> Average CPU load on server nodes grouped by operating system:
> > > >>>>>>>
> > > >>>>>>> SELECT na.VALUE, COUNT(n.ID), AVG(nm.AVG_CPU_LOAD) AVG_CPU_LOAD
> > > >>>>>>> FROM INFORMATION_SCHEMA.IGNITE_NODES n
> > > >>>>>>> JOIN INFORMATION_SCHEMA.IGNITE_NODE_ATTRIBUTES na ON na.NODE_ID
> > =
> > > >>>> n.ID
> > > >>>>>> AND
> > > >>>>>>> na.NAME = 'os.name'
> > > >>>>>>> JOIN INFORMATION_SCHEMA.IGNITE_NODE_METRICS nm ON nm.NODE_ID =
> > > >>> n.ID
> > > >>>>>>> WHERE n.IS_CLIENT = false
> > > >>>>>>> GROUP BY na.VALUE
> > > >>>>>>>
> > > >>>>>>>
> > > >>>>>>>
> > > >>>>>>> Top 5 nodes by puts to cache ‘cache’:
> > > >>>>>>>
> > > >>>>>>> SELECT cm.NODE_ID, cm.CACHE_PUTS FROM
> > > >>>>>>> INFORMATION_SCHEMA.IGNITE_CACHE_NODE_METRICS cm
> > > >>>>>>> WHERE cm.CACHE_NAME = 'cache'
> > > >>>>>>> ORDER BY cm.CACHE_PUTS DESC
> > > >>>>>>> LIMIT 5
> > > >>>>>>>
> > > >>>>>>>
> > > >>>>>>>
> > > >>>>>>> Does this implementation interesting to someone else? Maybe any
> > > >>> views
> > > >>>>> are
> > > >>>>>>> redundant? Which additional first-priority views must be
> > > >>> implemented?
> > > >>>>> Any
> > > >>>>>>> other thoughts or proposal?
> > > >>>>>>>
> > > >>>>>>> [1] https://github.com/apache/ignite/pull/3413
> > > >>>>>>>
> > > >>>>>>
> > > >>>>>
> > > >>>>
> > > >>>
> > > >>
> > > >>
> > >
> > >
> >
>

Re: Ignite diagnostic (SQL system views)

Alexey Plekhanov
Denis,

The views engine and some views were released in AI 2.7.
In 2.8 they will be moved to the new engine, and new views will be added
(as part of IEP-35).
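
For reference, a minimal sketch of how a user could query the views from any
SQL client. The schema names (IGNITE in 2.7 and SYS after the IEP-35 rework)
and the exact column names below are assumptions and should be verified
against the actual build:

-- AI 2.7: system views assumed to be registered under the IGNITE schema
SELECT ID, IS_CLIENT
FROM IGNITE.NODES
WHERE IS_CLIENT = false

-- After IEP-35: the same data assumed to be exposed through the SYS schema
SELECT NODE_ID, IS_CLIENT
FROM SYS.NODES
WHERE IS_CLIENT = false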


On Fri, Oct 18, 2019 at 00:50, Denis Magda <[hidden email]> wrote:

> Anton, Maxim,
>
> Are we planning to release the views as part of 2.8? Don't see them listed
> in the important features section:
>
> https://cwiki.apache.org/confluence/display/IGNITE/Apache+Ignite+2.8#ApacheIgnite2.8-Themostimportantreleasetasks
>
> -
> Denis
>
>
> On Wed, Feb 14, 2018 at 1:49 AM Anton Vinogradov <[hidden email]
> >
> wrote:
>
> > Vova,
> >
> > Could you confirm https://issues.apache.org/jira/browse/IGNITE-7527
> ready
> > to be merged?
> >
> > On Wed, Feb 14, 2018 at 12:01 PM, Vladimir Ozerov <[hidden email]>
> > wrote:
> >
> > > I would start with NODES and NODE_ATTRIBUTES as the most simple thing.
> > >
> > > On Tue, Feb 13, 2018 at 4:10 AM, Denis Magda <[hidden email]>
> wrote:
> > >
> > > > Alex P, sounds like a good plan for me.
> > > >
> > > > Vladimir, do you have any suggestions or corrections?
> > > >
> > > > —
> > > > Denis
> > > >
> > > > > On Feb 12, 2018, at 4:57 AM, Alex Plehanov <
> [hidden email]>
> > > > wrote:
> > > > >
> > > > > The views engine and the first view are almost ready to merge
> (review
> > > > > comments are resolved). Which views should we take next? My
> proposal
> > -
> > > > > NODES, NODE_ATTRIBUTES, NODE_METRICS, NODE_HOSTS and
> NODE_ADDRESSES,
> > > > since
> > > > > these views are clear and all topology data available on each node.
> > > > > Any objections?
> > > > >
> > > > > 2018-01-25 16:27 GMT+03:00 Alex Plehanov <[hidden email]
> >:
> > > > >
> > > > >> Anton, Vladimir, I've made some fixes. There is only one view left
> > and
> > > > >> it's renamed to 'IGNITE.LOCAL_TRANSACTIONS'.
> > > > >>
> > > > >> High level design of solution:
> > > > >> When IgniteH2Indexing is starting, it create and start
> > > > >> new GridH2SysViewProcessor, which create and register in H2 (via
> its
> > > own
> > > > >> table engine) all implementations of system views. Each system
> view
> > > > >> implementation extends base abstract class GridH2SysView. View
> > > > >> implementation describes columns, their types and indexes in
> > > constructor
> > > > >> and must override method getRows for data retrieval (this method
> > > called
> > > > by
> > > > >> H2-compatible table and index implementations for ignite system
> > > views).
> > > > >> Almost no fixes to existing parsing engine was made, except some
> > > places,
> > > > >> where GridH2Table instance was expected, but for system views
> there
> > is
> > > > >> another class.
> > > > >>
> > > > >> New PR: [1].  Please have a look.
> > > > >>
> > > > >> [1] https://github.com/apache/ignite/pull/3433
> > > > >>
> > > > >> 2018-01-24 19:12 GMT+03:00 Anton Vinogradov <
> > [hidden email]
> > > >:
> > > > >>
> > > > >>> I've created IEP-13 [1] to cover all cases.
> > > > >>> Feel free to create issues.
> > > > >>>
> > > > >>> [1]
> > > > >>> https://cwiki.apache.org/confluence/pages/viewpage.
> > > > action?pageId=75962769
> > > > >>>
> > > > >>> On Wed, Jan 24, 2018 at 6:10 PM, Vladimir Ozerov <
> > > [hidden email]
> > > > >
> > > > >>> wrote:
> > > > >>>
> > > > >>>> Let's start with a single and the most simple view, e.g.
> > > > >>>> LOCAL_TRANSACTIONS. We will review and merge it along with
> > necessary
> > > > >>>> infrastructure. Then will handle the rest view in separate
> tickets
> > > and
> > > > >>>> separate focused discussions.
> > > > >>>>
> > > > >>>> On Wed, Jan 24, 2018 at 5:29 PM, Alex Plehanov <
> > > > [hidden email]
> > > > >>>>
> > > > >>>> wrote:
> > > > >>>>
> > > > >>>>> 1) It’s not a principal point, I can change schema. The
> > > > >>>> INFORMATION_SCHEMA
> > > > >>>>> was used because it’s already exists and usually used for
> > metadata
> > > > >>> tables
> > > > >>>>> and views. Your proposal is to use schema “IGNITE”, am I
> > understand
> > > > >>> you
> > > > >>>>> right? BTW, for now, we can’t query another (H2) meta tables
> from
> > > the
> > > > >>>>> INFORMATION_SCHEMA, so, “Ignite system views” is only available
> > > views
> > > > >>> to
> > > > >>>>> query from this schema.
> > > > >>>>> 2) Exactly for this reason the IGNITE_INSTANCE view is useful:
> to
> > > > >>>> determine
> > > > >>>>> which node we are connected to.
> > > > >>>>> 3) As the first phase, in my opinion, local views will be
> enough.
> > > > >>>>> Performance and caching of distributed views should be
> discussed
> > at
> > > > >>> next
> > > > >>>>> phases, when distributed views implementation will be planned.
> In
> > > > >>> current
> > > > >>>>> implementation I tried to use indexing for local views wherever
> > > it’s
> > > > >>>>> possible.
> > > > >>>>> 4) I don’t think, that JVM info is more critical information
> > than,
> > > > for
> > > > >>>>> example, caches or nodes information. When authorization
> > > capabilities
> > > > >>>>> planned to implement?
> > > > >>>>>
> > > > >>>>> About local data: yes, we can rename all currently implemented
> > > views
> > > > >>> for
> > > > >>>>> the local node data as LOCAL_..., and create (someday) new
> whole
> > > > >>> cluster
> > > > >>>>> views (which use distributed requests) without prefix or, for
> > > > example,
> > > > >>>> with
> > > > >>>>> CLUSTER_ prefix. But some views can show all cluster
> information
> > > > using
> > > > >>>> only
> > > > >>>>> local node data, without distributed requests (for example
> > > > >>>>> IGNITE_NODE_METRICS, IGNITE_PART_ASSIGNMENT,
> > > IGNITE_PART_ALLOCATION,
> > > > >>>>> IGNITE_NODES, etc). Are they local or cluster views in this
> > > concept?
> > > > >>>> Which
> > > > >>>>> prefix should be used? And what about caches? Are they local or
> > > > >>> cluster?
> > > > >>>> On
> > > > >>>>> local node we can see cluster wide caches (replicated and
> > > > distributed)
> > > > >>>> and
> > > > >>>>> caches for current node only. Local caches list may differ from
> > > node
> > > > >>> to
> > > > >>>>> node. Which prefix should be used for this view? And one more,
> > > there
> > > > >>> is
> > > > >>>> no
> > > > >>>>> sense for some views to make them cluster wide (for example
> > > > >>>>> INGNITE_INSTANCE). Should we name it LOCAL_INSTANCE without
> > > creating
> > > > >>>>> INSTANCE view?
> > > > >>>>>
> > > > >>>>> So, next steps: split PR, change schema name (IGNITE?), change
> > view
> > > > >>> name
> > > > >>>>> for caches (CACHES, LOCAL_CACHES?)
> > > > >>>>>
> > > > >>>>>
> > > > >>>>> 2018-01-24 13:03 GMT+03:00 Vladimir Ozerov <
> [hidden email]
> > >:
> > > > >>>>>
> > > > >>>>>> Hi Alex,
> > > > >>>>>>
> > > > >>>>>> System views could be extremely valuable addition for Ignite.
> > > > >>> Ideally,
> > > > >>>>> user
> > > > >>>>>> should be able to monitor and manage state of the whole
> cluster
> > > > >>> with a
> > > > >>>>>> single SQL command line. We have plans to implement it for a
> > very
> > > > >>> long
> > > > >>>>>> time. However, this is very sensitive task which should take a
> > lot
> > > > >>> of
> > > > >>>>>> moving pieces in count, such as usability, consistency,
> > > performance,
> > > > >>>>>> security, etc..
> > > > >>>>>>
> > > > >>>>>> Let me point several major concerns I see at the moment:
> > > > >>>>>>
> > > > >>>>>> 1) Usability: INFORMATION_SCHEMA
> > > > >>>>>> This schema is part of SQL ANSI standard. When creating system
> > > > >>> views,
> > > > >>>>> some
> > > > >>>>>> vendors prefer to store them in completely different
> predefined
> > > > >>> schema
> > > > >>>>>> (Oracle, MS SQL). Others prefer to keep them in
> > INFORMATION_SCHEMA
> > > > >>>>>> directly. Both approaches could work. However, the latter
> breaks
> > > > >>>>> separation
> > > > >>>>>> of concerns - we store typical metadata near to possibly
> > sensitive
> > > > >>>> system
> > > > >>>>>> data. Also it makes security management more complex - system
> > data
> > > > >>> is
> > > > >>>>> very
> > > > >>>>>> sensitive, and now we cannot simply grant access
> > > > >>> INFORMATIONAL_SCHEMA
> > > > >>>> to
> > > > >>>>>> user. Instead, we have to grant that access on per-view basis.
> > For
> > > > >>> this
> > > > >>>>>> reason my preference is to store system tables in separate
> > schema,
> > > > >>> not
> > > > >>>> in
> > > > >>>>>> INFORMATION_SCHEMA
> > > > >>>>>>
> > > > >>>>>> 2) Consistency: local data
> > > > >>>>>> One of implemented view GridH2SysViewImplInstance. Normally
> SQL
> > > > >>> users
> > > > >>>>>> communicate with Ignite through JDBC/ODBC drivers. These
> drivers
> > > are
> > > > >>>>>> connected to a single node, typically client node. Moreover,
> we
> > > will
> > > > >>>>>> introduce high-availability feature when drivers were able to
> > > > >>> connect
> > > > >>>> to
> > > > >>>>>> any address from a predefined list. It renders this view
> > useless,
> > > as
> > > > >>>> you
> > > > >>>>> do
> > > > >>>>>> not know which node you connected to. Also, local-only data
> > cannot
> > > > >>> be
> > > > >>>>>> joined in general case - you will receive different results on
> > > > >>>> different
> > > > >>>>>> nodes. The same goes for transactions, JVM info, etc.
> > > > >>>>>>
> > > > >>>>>> 3) Performance
> > > > >>>>>> Suppose we fixed consistency of transactions and now this view
> > > shows
> > > > >>>>>> transactions in the whole cluster with possibility to filter
> > them
> > > by
> > > > >>>>> nodes
> > > > >>>>>> - this is what user would expect out of the box. Another
> problem
> > > > >>>> appears
> > > > >>>>>> then - performance. How would we collect necessary data? How
> > would
> > > > >>> we
> > > > >>>>>> handle joins, when particular view could be scanned multiple
> > times
> > > > >>>> during
> > > > >>>>>> query execution? How we achieve sensible consistency? Most
> > > probably
> > > > >>> we
> > > > >>>>>> would collect remote data once when query is started, cache it
> > > > >>> somehow
> > > > >>>> on
> > > > >>>>>> query session level, and then re-use during joins. But again,
> > this
> > > > >>>> should
> > > > >>>>>> be discussed separately.
> > > > >>>>>>
> > > > >>>>>> 4) Security: JVM info
> > > > >>>>>> We should define clear boundaries of what info is exposed. JVM
> > > data
> > > > >>>> along
> > > > >>>>>> with running threads is critically sensitive information. We
> > > should
> > > > >>> not
> > > > >>>>>> expose it until we have authorization capabilities.
> > > > >>>>>>
> > > > >>>>>> In order to start moving this code from prototype to
> production
> > > > >>> state
> > > > >>>> we
> > > > >>>>>> should start with the most simple and consistent views. E.g.
> > > > >>>>> IGNITE_CACHES.
> > > > >>>>>> Let's move it to a separate PR, review infrastructure code,
> > review
> > > > >>> view
> > > > >>>>>> implementation, agree on proper naming and placement, and
> merge
> > > it.
> > > > >>>> Then
> > > > >>>>>> each and every view (or group of related views) should be
> > > discussed
> > > > >>> and
> > > > >>>>>> reviewed separately.
> > > > >>>>>>
> > > > >>>>>> As far as node-local stuff, may be we should move it to a
> > separate
> > > > >>>>> schema,
> > > > >>>>>> or mark with special prefix. E.g. "IGNITE.TRANSACTIONS" - all
> > > > >>>>> transactions
> > > > >>>>>> in the cluster, "IGNITE.LOCAL_TRANSACTIONS" - transactions on
> > the
> > > > >>> local
> > > > >>>>>> node. In this case we will be able to merge "local" stuff
> > shortly,
> > > > >>> and
> > > > >>>>>> implement more complex but at the same time much more useful
> > > > >>>> distributed
> > > > >>>>>> stuff later on.
> > > > >>>>>>
> > > > >>>>>> Makes sense?
> > > > >>>>>>
> > > > >>>>>> Vladimir.
> > > > >>>>>>
> > > > >>>>>>
> > > > >>>>>> On Tue, Jan 23, 2018 at 8:30 PM, Alex Plehanov <
> > > > >>>> [hidden email]>
> > > > >>>>>> wrote:
> > > > >>>>>>
> > > > >>>>>>> Hello, Igniters!
> > > > >>>>>>>
> > > > >>>>>>> For Ignite diagnostic usually it’s helpful to get some Ignite
> > > > >>>> internals
> > > > >>>>>>> information. But currently, in my opinion, there are no
> > > convenient
> > > > >>>>> tools
> > > > >>>>>>> for this purpose:
> > > > >>>>>>>
> > > > >>>>>>> ·        Some issues can be solved by analyzing log files.
> Log
> > > > >>> files
> > > > >>>>> are
> > > > >>>>>>> useful for dumps, but sometimes they are difficult to read.
> > Also
> > > > >>>>>>> interesting metrics can’t be received runtime by request, we
> > need
> > > > >>> to
> > > > >>>>> wait
> > > > >>>>>>> until Ignite will write these metrics by timeout or other
> > events.
> > > > >>>>>>>
> > > > >>>>>>> ·        JMX is useful for scalar metrics. Complex and table
> > data
> > > > >>> can
> > > > >>>>>> also
> > > > >>>>>>> be received, but it’s difficult to read, filter and sort them
> > > > >>> without
> > > > >>>>>>> processing by specialized external tools. For most frequently
> > > used
> > > > >>>>> cases
> > > > >>>>>>> almost duplicating metrics are created to show data in an
> > > > >>>> easy-to-read
> > > > >>>>>>> form.
> > > > >>>>>>>
> > > > >>>>>>> ·        Web-console is able to show table and complex data.
> > > > >>> Perhaps,
> > > > >>>>>>> someday  web-console will contain all necessary dashboards
> for
> > > > >>> most
> > > > >>>>>> problem
> > > > >>>>>>> investigation, but some non-trivial queries will not be
> covered
> > > > >>>> anyway.
> > > > >>>>>>> Also web-console needs additional infrastructure to work.
> > > > >>>>>>>
> > > > >>>>>>> ·        External “home-made” tools can be used for
> non-trivial
> > > > >>>> cases.
> > > > >>>>>> They
> > > > >>>>>>> cover highly specialized cases and usually can’t be used as
> > > > >>> general
> > > > >>>>>> purpose
> > > > >>>>>>> tools.
> > > > >>>>>>>
> > > > >>>>>>> Sometimes we are forced to use more than one tool and join
> data
> > > by
> > > > >>>>> hands
> > > > >>>>>>> (for example, current thread dump and data from logs).
> > > > >>>>>>>
> > > > >>>>>>> Often RDBMS for diagnostic purposes provides system views
> (for
> > > > >>>> example,
> > > > >>>>>>> DBA_% and V$% in Oracle), which can be queried by SQL. This
> > > > >>> solution
> > > > >>>>>> makes
> > > > >>>>>>> all internal diagnostic information available in a readable
> > form
> > > > >>>> (with
> > > > >>>>>> all
> > > > >>>>>>> possible filters and projections) without using any other
> > > > >>> internal or
> > > > >>>>>>> external tools. My proposal is to create similar system views
> > in
> > > > >>>>> Ignite.
> > > > >>>>>>>
> > > > >>>>>>> I implement working prototype (PR: [1]). It contains views:
> > > > >>>>>>>
> > > > >>>>>>> IGNITE_SYSTEM_VIEWS
> > > > >>>>>>>
> > > > >>>>>>> Registered system views
> > > > >>>>>>>
> > > > >>>>>>> IGNITE_INSTANCE
> > > > >>>>>>>
> > > > >>>>>>> Ignite instance
> > > > >>>>>>>
> > > > >>>>>>> IGNITE_JVM_THREADS
> > > > >>>>>>>
> > > > >>>>>>> JVM threads
> > > > >>>>>>>
> > > > >>>>>>> IGNITE_JVM_RUNTIME
> > > > >>>>>>>
> > > > >>>>>>> JVM runtime
> > > > >>>>>>>
> > > > >>>>>>> IGNITE_JVM_OS
> > > > >>>>>>>
> > > > >>>>>>> JVM operating system
> > > > >>>>>>>
> > > > >>>>>>> IGNITE_CACHES
> > > > >>>>>>>
> > > > >>>>>>> Ignite caches
> > > > >>>>>>>
> > > > >>>>>>> IGNITE_CACHE_CLUSTER_METRICS
> > > > >>>>>>>
> > > > >>>>>>> Ignite cache cluster metrics
> > > > >>>>>>>
> > > > >>>>>>> IGNITE_CACHE_NODE_METRICS
> > > > >>>>>>>
> > > > >>>>>>> Ignite cache node metrics
> > > > >>>>>>>
> > > > >>>>>>> IGNITE_CACHE_GROUPS
> > > > >>>>>>>
> > > > >>>>>>> Cache groups
> > > > >>>>>>>
> > > > >>>>>>> IGNITE_NODES
> > > > >>>>>>>
> > > > >>>>>>> Nodes in topology
> > > > >>>>>>>
> > > > >>>>>>> IGNITE_NODE_HOSTS
> > > > >>>>>>>
> > > > >>>>>>> Node hosts
> > > > >>>>>>>
> > > > >>>>>>> IGNITE_NODE_ADDRESSES
> > > > >>>>>>>
> > > > >>>>>>> Node addresses
> > > > >>>>>>>
> > > > >>>>>>> IGNITE_NODE_ATTRIBUTES
> > > > >>>>>>>
> > > > >>>>>>> Node attributes
> > > > >>>>>>>
> > > > >>>>>>> IGNITE_NODE_METRICS
> > > > >>>>>>>
> > > > >>>>>>> Node metrics
> > > > >>>>>>>
> > > > >>>>>>> IGNITE_TRANSACTIONS
> > > > >>>>>>>
> > > > >>>>>>> Active transactions
> > > > >>>>>>>
> > > > >>>>>>> IGNITE_TRANSACTION_ENTRIES
> > > > >>>>>>>
> > > > >>>>>>> Cache entries used by transaction
> > > > >>>>>>>
> > > > >>>>>>> IGNITE_TASKS
> > > > >>>>>>>
> > > > >>>>>>> Active tasks
> > > > >>>>>>>
> > > > >>>>>>> IGNITE_PART_ASSIGNMENT
> > > > >>>>>>>
> > > > >>>>>>> Partition assignment map
> > > > >>>>>>>
> > > > >>>>>>> IGNITE_PART_ALLOCATION
> > > > >>>>>>>
> > > > >>>>>>> Partition allocation map
> > > > >>>>>>>
> > > > >>>>>>>
> > > > >>>>>>>
> > > > >>>>>>> There are much more useful views can be implemented
> (executors
> > > > >>>>>> diagnostic,
> > > > >>>>>>> SPIs diagnostic, etc).
> > > > >>>>>>>
> > > > >>>>>>> Some usage examples:
> > > > >>>>>>>
> > > > >>>>>>> Cache groups and their partitions, which used by transaction
> > more
> > > > >>>> than
> > > > >>>>> 5
> > > > >>>>>>> minutes long:
> > > > >>>>>>>
> > > > >>>>>>> SELECT cg.CACHE_OR_GROUP_NAME, te.KEY_PARTITION, count(*) AS
> > > > >>>>> ENTITIES_CNT
> > > > >>>>>>> FROM INFORMATION_SCHEMA.IGNITE_TRANSACTIONS t
> > > > >>>>>>> JOIN INFORMATION_SCHEMA.IGNITE_TRANSACTION_ENTRIES te ON
> t.XID
> > =
> > > > >>>>> te.XID
> > > > >>>>>>> JOIN INFORMATION_SCHEMA.IGNITE_CACHES c ON te.CACHE_NAME =
> > > c.NAME
> > > > >>>>>>> JOIN INFORMATION_SCHEMA.IGNITE_CACHE_GROUPS cg ON c.GROUP_ID
> =
> > > > >>> cg.ID
> > > > >>>>>>> WHERE t.START_TIME < TIMESTAMPADD('MINUTE', -5, NOW())
> > > > >>>>>>> GROUP BY cg.CACHE_OR_GROUP_NAME, te.KEY_PARTITION
> > > > >>>>>>>
> > > > >>>>>>>
> > > > >>>>>>>
> > > > >>>>>>> Average CPU load on server nodes grouped by operating system:
> > > > >>>>>>>
> > > > >>>>>>> SELECT na.VALUE, COUNT(n.ID), AVG(nm.AVG_CPU_LOAD)
> AVG_CPU_LOAD
> > > > >>>>>>> FROM INFORMATION_SCHEMA.IGNITE_NODES n
> > > > >>>>>>> JOIN INFORMATION_SCHEMA.IGNITE_NODE_ATTRIBUTES na ON
> na.NODE_ID
> > > =
> > > > >>>> n.ID
> > > > >>>>>> AND
> > > > >>>>>>> na.NAME = 'os.name'
> > > > >>>>>>> JOIN INFORMATION_SCHEMA.IGNITE_NODE_METRICS nm ON nm.NODE_ID
> =
> > > > >>> n.ID
> > > > >>>>>>> WHERE n.IS_CLIENT = false
> > > > >>>>>>> GROUP BY na.VALUE
> > > > >>>>>>>
> > > > >>>>>>>
> > > > >>>>>>>
> > > > >>>>>>> Top 5 nodes by puts to cache ‘cache’:
> > > > >>>>>>>
> > > > >>>>>>> SELECT cm.NODE_ID, cm.CACHE_PUTS FROM
> > > > >>>>>>> INFORMATION_SCHEMA.IGNITE_CACHE_NODE_METRICS cm
> > > > >>>>>>> WHERE cm.CACHE_NAME = 'cache'
> > > > >>>>>>> ORDER BY cm.CACHE_PUTS DESC
> > > > >>>>>>> LIMIT 5
> > > > >>>>>>>
> > > > >>>>>>>
> > > > >>>>>>>
> > > > >>>>>>> Does this implementation interesting to someone else? Maybe
> any
> > > > >>> views
> > > > >>>>> are
> > > > >>>>>>> redundant? Which additional first-priority views must be
> > > > >>> implemented?
> > > > >>>>> Any
> > > > >>>>>>> other thoughts or proposal?
> > > > >>>>>>>
> > > > >>>>>>> [1] https://github.com/apache/ignite/pull/3413
> > > > >>>>>>>
> > > > >>>>>>
> > > > >>>>>
> > > > >>>>
> > > > >>>
> > > > >>
> > > > >>
> > > >
> > > >
> > >
> >
>

Re: Ignite diagnostic (SQL system views)

dmagda
Alex, Igniters,

Which of us contributed this feature? I don't see any documentation, and it's
not clear how users are expected to benefit from the capability or how anyone
will become aware that the feature exists.

We need to close the gap and spread the word.

Denis

On Thursday, October 17, 2019, Alex Plehanov <[hidden email]>
wrote:

> Denis,
>
> Views engine and some views were released in AI 2.7.
> In 2.8 they will be moved to the new engine and new views will be added (as
> part of IEP-35)
>
>
> On Fri, Oct 18, 2019 at 00:50, Denis Magda <[hidden email]> wrote:
>
> > Anton, Maxim,
> >
> > Are we planning to release the views as part of 2.8? Don't see them
> listed
> > in the important features section:
> >
> > https://cwiki.apache.org/confluence/display/IGNITE/Apache+Ignite+2.8#
> ApacheIgnite2.8-Themostimportantreleasetasks
> >
> > -
> > Denis
> >
> >
> > On Wed, Feb 14, 2018 at 1:49 AM Anton Vinogradov <
> [hidden email]
> > >
> > wrote:
> >
> > > Vova,
> > >
> > > Could you confirm https://issues.apache.org/jira/browse/IGNITE-7527
> > ready
> > > to be merged?
> > >
> > > On Wed, Feb 14, 2018 at 12:01 PM, Vladimir Ozerov <
> [hidden email]>
> > > wrote:
> > >
> > > > I would start with NODES and NODE_ATTRIBUTES as the most simple
> thing.
> > > >
> > > > On Tue, Feb 13, 2018 at 4:10 AM, Denis Magda <[hidden email]>
> > wrote:
> > > >
> > > > > Alex P, sounds like a good plan for me.
> > > > >
> > > > > Vladimir, do you have any suggestions or corrections?
> > > > >
> > > > > —
> > > > > Denis
> > > > >
> > > > > > On Feb 12, 2018, at 4:57 AM, Alex Plehanov <
> > [hidden email]>
> > > > > wrote:
> > > > > >
> > > > > > The views engine and the first view are almost ready to merge
> > (review
> > > > > > comments are resolved). Which views should we take next? My
> > proposal
> > > -
> > > > > > NODES, NODE_ATTRIBUTES, NODE_METRICS, NODE_HOSTS and
> > NODE_ADDRESSES,
> > > > > since
> > > > > > these views are clear and all topology data available on each
> node.
> > > > > > Any objections?
> > > > > >
> > > > > > 2018-01-25 16:27 GMT+03:00 Alex Plehanov <
> [hidden email]
> > >:
> > > > > >
> > > > > >> Anton, Vladimir, I've made some fixes. There is only one view
> left
> > > and
> > > > > >> it's renamed to 'IGNITE.LOCAL_TRANSACTIONS'.
> > > > > >>
> > > > > >> High level design of solution:
> > > > > >> When IgniteH2Indexing is starting, it create and start
> > > > > >> new GridH2SysViewProcessor, which create and register in H2 (via
> > its
> > > > own
> > > > > >> table engine) all implementations of system views. Each system
> > view
> > > > > >> implementation extends base abstract class GridH2SysView. View
> > > > > >> implementation describes columns, their types and indexes in
> > > > constructor
> > > > > >> and must override method getRows for data retrieval (this method
> > > > called
> > > > > by
> > > > > >> H2-compatible table and index implementations for ignite system
> > > > views).
> > > > > >> Almost no fixes to existing parsing engine was made, except some
> > > > places,
> > > > > >> where GridH2Table instance was expected, but for system views
> > there
> > > is
> > > > > >> another class.
> > > > > >>
> > > > > >> New PR: [1].  Please have a look.
> > > > > >>
> > > > > >> [1] https://github.com/apache/ignite/pull/3433
> > > > > >>
> > > > > >> 2018-01-24 19:12 GMT+03:00 Anton Vinogradov <
> > > [hidden email]
> > > > >:
> > > > > >>
> > > > > >>> I've created IEP-13 [1] to cover all cases.
> > > > > >>> Feel free to create issues.
> > > > > >>>
> > > > > >>> [1]
> > > > > >>> https://cwiki.apache.org/confluence/pages/viewpage.
> > > > > action?pageId=75962769
> > > > > >>>
> > > > > >>> On Wed, Jan 24, 2018 at 6:10 PM, Vladimir Ozerov <
> > > > [hidden email]
> > > > > >
> > > > > >>> wrote:
> > > > > >>>
> > > > > >>>> Let's start with a single and the most simple view, e.g.
> > > > > >>>> LOCAL_TRANSACTIONS. We will review and merge it along with
> > > necessary
> > > > > >>>> infrastructure. Then will handle the rest view in separate
> > tickets
> > > > and
> > > > > >>>> separate focused discussions.
> > > > > >>>>
> > > > > >>>> On Wed, Jan 24, 2018 at 5:29 PM, Alex Plehanov <
> > > > > [hidden email]
> > > > > >>>>
> > > > > >>>> wrote:
> > > > > >>>>
> > > > > >>>>> 1) It’s not a principal point, I can change schema. The
> > > > > >>>> INFORMATION_SCHEMA
> > > > > >>>>> was used because it’s already exists and usually used for
> > > metadata
> > > > > >>> tables
> > > > > >>>>> and views. Your proposal is to use schema “IGNITE”, am I
> > > understand
> > > > > >>> you
> > > > > >>>>> right? BTW, for now, we can’t query another (H2) meta tables
> > from
> > > > the
> > > > > >>>>> INFORMATION_SCHEMA, so, “Ignite system views” is only
> available
> > > > views
> > > > > >>> to
> > > > > >>>>> query from this schema.
> > > > > >>>>> 2) Exactly for this reason the IGNITE_INSTANCE view is
> useful:
> > to
> > > > > >>>> determine
> > > > > >>>>> which node we are connected to.
> > > > > >>>>> 3) As the first phase, in my opinion, local views will be
> > enough.
> > > > > >>>>> Performance and caching of distributed views should be
> > discussed
> > > at
> > > > > >>> next
> > > > > >>>>> phases, when distributed views implementation will be
> planned.
> > In
> > > > > >>> current
> > > > > >>>>> implementation I tried to use indexing for local views
> wherever
> > > > it’s
> > > > > >>>>> possible.
> > > > > >>>>> 4) I don’t think, that JVM info is more critical information
> > > than,
> > > > > for
> > > > > >>>>> example, caches or nodes information. When authorization
> > > > capabilities
> > > > > >>>>> planned to implement?
> > > > > >>>>>
> > > > > >>>>> About local data: yes, we can rename all currently
> implemented
> > > > views
> > > > > >>> for
> > > > > >>>>> the local node data as LOCAL_..., and create (someday) new
> > whole
> > > > > >>> cluster
> > > > > >>>>> views (which use distributed requests) without prefix or, for
> > > > > example,
> > > > > >>>> with
> > > > > >>>>> CLUSTER_ prefix. But some views can show all cluster
> > information
> > > > > using
> > > > > >>>> only
> > > > > >>>>> local node data, without distributed requests (for example
> > > > > >>>>> IGNITE_NODE_METRICS, IGNITE_PART_ASSIGNMENT,
> > > > IGNITE_PART_ALLOCATION,
> > > > > >>>>> IGNITE_NODES, etc). Are they local or cluster views in this
> > > > concept?
> > > > > >>>> Which
> > > > > >>>>> prefix should be used? And what about caches? Are they local
> or
> > > > > >>> cluster?
> > > > > >>>> On
> > > > > >>>>> local node we can see cluster wide caches (replicated and
> > > > > distributed)
> > > > > >>>> and
> > > > > >>>>> caches for current node only. Local caches list may differ
> from
> > > > node
> > > > > >>> to
> > > > > >>>>> node. Which prefix should be used for this view? And one
> more,
> > > > there
> > > > > >>> is
> > > > > >>>> no
> > > > > >>>>> sense for some views to make them cluster wide (for example
> > > > > >>>>> INGNITE_INSTANCE). Should we name it LOCAL_INSTANCE without
> > > > creating
> > > > > >>>>> INSTANCE view?
> > > > > >>>>>
> > > > > >>>>> So, next steps: split PR, change schema name (IGNITE?),
> change
> > > view
> > > > > >>> name
> > > > > >>>>> for caches (CACHES, LOCAL_CACHES?)
> > > > > >>>>>
> > > > > >>>>>
> > > > > >>>>> 2018-01-24 13:03 GMT+03:00 Vladimir Ozerov <
> > [hidden email]
> > > >:
> > > > > >>>>>
> > > > > >>>>>> Hi Alex,
> > > > > >>>>>>
> > > > > >>>>>> System views could be extremely valuable addition for
> Ignite.
> > > > > >>> Ideally,
> > > > > >>>>> user
> > > > > >>>>>> should be able to monitor and manage state of the whole
> > cluster
> > > > > >>> with a
> > > > > >>>>>> single SQL command line. We have plans to implement it for a
> > > very
> > > > > >>> long
> > > > > >>>>>> time. However, this is very sensitive task which should
> take a
> > > lot
> > > > > >>> of
> > > > > >>>>>> moving pieces in count, such as usability, consistency,
> > > > performance,
> > > > > >>>>>> security, etc..
> > > > > >>>>>>
> > > > > >>>>>> Let me point several major concerns I see at the moment:
> > > > > >>>>>>
> > > > > >>>>>> 1) Usability: INFORMATION_SCHEMA
> > > > > >>>>>> This schema is part of SQL ANSI standard. When creating
> system
> > > > > >>> views,
> > > > > >>>>> some
> > > > > >>>>>> vendors prefer to store them in completely different
> > predefined
> > > > > >>> schema
> > > > > >>>>>> (Oracle, MS SQL). Others prefer to keep them in
> > > INFORMATION_SCHEMA
> > > > > >>>>>> directly. Both approaches could work. However, the latter
> > breaks
> > > > > >>>>> separation
> > > > > >>>>>> of concerns - we store typical metadata near to possibly
> > > sensitive
> > > > > >>>> system
> > > > > >>>>>> data. Also it makes security management more complex -
> system
> > > data
> > > > > >>> is
> > > > > >>>>> very
> > > > > >>>>>> sensitive, and now we cannot simply grant access
> > > > > >>> INFORMATIONAL_SCHEMA
> > > > > >>>> to
> > > > > >>>>>> user. Instead, we have to grant that access on per-view
> basis.
> > > For
> > > > > >>> this
> > > > > >>>>>> reason my preference is to store system tables in separate
> > > schema,
> > > > > >>> not
> > > > > >>>> in
> > > > > >>>>>> INFORMATION_SCHEMA
> > > > > >>>>>>
> > > > > >>>>>> 2) Consistency: local data
> > > > > >>>>>> One of implemented view GridH2SysViewImplInstance. Normally
> > SQL
> > > > > >>> users
> > > > > >>>>>> communicate with Ignite through JDBC/ODBC drivers. These
> > drivers
> > > > are
> > > > > >>>>>> connected to a single node, typically client node. Moreover,
> > we
> > > > will
> > > > > >>>>>> introduce high-availability feature when drivers were able
> to
> > > > > >>> connect
> > > > > >>>> to
> > > > > >>>>>> any address from a predefined list. It renders this view
> > > useless,
> > > > as
> > > > > >>>> you
> > > > > >>>>> do
> > > > > >>>>>> not know which node you connected to. Also, local-only data
> > > cannot
> > > > > >>> be
> > > > > >>>>>> joined in general case - you will receive different results
> on
> > > > > >>>> different
> > > > > >>>>>> nodes. The same goes for transactions, JVM info, etc.
> > > > > >>>>>>


--
-
Denis

Re: Ignite diagnostic (SQL system views)

Nikolay Izhikov-2
Denis, AFAIK we don't have documentation for the SQL system views existing in Ignite.

I have plans to write documentation about metrics and system views that will cover the SQL system views.
It will be available by the 2.8 release.



On Fri, 18 Oct 2019 at 07:16 -0700, Denis Magda wrote:

> Alex, Igniters,
>
> Who of us was contributing this feature? I don’t see any documentation, not
> clear how the users are expected to benefit from the capability and how
> everybody will be aware of the feature existence.
>
> We need to close the gap and spread the word.
>
> Denis
>

Re: Ignite diagnostic (SQL system views)

Alexey Plekhanov
Documentation for views released in 2.7 is here:
https://apacheignite-sql.readme.io/docs/system-views
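
For a quick smoke test of the released views, a query along the following lines should do (a minimal sketch assuming the 2.7 layout described on that page, i.e. the views live in the IGNITE schema and NODES/NODE_ATTRIBUTES expose ID, IS_CLIENT, NODE_ID, NAME and VALUE columns; exact view and column names may differ in other versions):

SELECT n.ID, n.IS_CLIENT, na.VALUE AS OS_NAME
FROM IGNITE.NODES n
JOIN IGNITE.NODE_ATTRIBUTES na ON na.NODE_ID = n.ID
WHERE na.NAME = 'os.name'
ORDER BY n.ID

It can be run from any SQL tool connected over JDBC/ODBC, for example the sqlline script shipped with Ignite.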

On Fri, 18 Oct 2019 at 17:24, Nikolay Izhikov <[hidden email]> wrote:

> Denis, AFAIK we don't have documentation for the SQL system views
> existing in Ignite.
>
> I have plans to write documentation about metrics and system views that
> will cover the SQL system views.
> It will be available by the 2.8 release.
Reply | Threaded
Open this post in threaded view
|

Re: Ignite diagnostic (SQL system views)

dmagda
Alex, Nikolay, thanks! Forgive me for my poor googling skills ;)

-
Denis


On Fri, Oct 18, 2019 at 7:26 AM Alex Plehanov <[hidden email]> wrote:

> Documentation for views released in 2.7 is here:
> https://apacheignite-sql.readme.io/docs/system-views
>
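For readers who want to poke at those 2.7 views directly, here is a small sketch using the thin JDBC driver. It assumes the IGNITE schema and the NODES view as described on the page linked above and prints whatever columns the server reports, so no column list is hard-coded; adjust the view name if the documentation for your release differs.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.ResultSetMetaData;
import java.sql.Statement;

public class QuerySystemViews {
    public static void main(String[] args) throws Exception {
        // Thin JDBC driver; ignite-core must be on the classpath.
        try (Connection conn = DriverManager.getConnection("jdbc:ignite:thin://127.0.0.1");
             Statement stmt = conn.createStatement();
             // IGNITE.NODES is one of the 2.7 system views per the linked docs.
             ResultSet rs = stmt.executeQuery("SELECT * FROM IGNITE.NODES")) {
            ResultSetMetaData meta = rs.getMetaData();
            while (rs.next()) {
                StringBuilder row = new StringBuilder();
                for (int i = 1; i <= meta.getColumnCount(); i++)
                    row.append(meta.getColumnLabel(i)).append('=').append(rs.getObject(i)).append("  ");
                System.out.println(row.toString().trim());
            }
        }
    }
}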
> пт, 18 окт. 2019 г. в 17:24, Nikolay Izhikov <[hidden email]>:
>
> > Denis, AFAIK we don't have documentation for the SQL system views that
> > already exist in Ignite.
> >
> > I have plans to write documentation about metrics and system views that
> > will cover the SQL system views.
> > It will be available by the 2.8 release.
> >
> >
> >
> > On Fri, 18/10/2019 at 07:16 -0700, Denis Magda wrote:
> > > Alex, Igniters,
> > >
> > > Which of us contributed this feature? I don’t see any documentation,
> > > and it's not clear how users are expected to benefit from the
> > > capability or how everybody will become aware of the feature's
> > > existence.
> > >
> > > We need to close the gap and spread the word.
> > >
> > > Denis
> > >
> > > On Thursday, October 17, 2019, Alex Plehanov <[hidden email]> wrote:
> > >
> > > > Denis,
> > > >
> > > > The views engine and some views were released in AI 2.7.
> > > > In 2.8 they will be moved to the new engine, and new views will be
> > > > added (as part of IEP-35).
> > > >
> > > > On Fri, Oct 18, 2019 at 00:50, Denis Magda <[hidden email]> wrote:
> > > >
> > > > > Anton, Maxim,
> > > > >
> > > > > Are we planning to release the views as part of 2.8? I don't see
> > > > > them listed in the important features section:
> > > > >
> > > > > https://cwiki.apache.org/confluence/display/IGNITE/Apache+Ignite+2.8#ApacheIgnite2.8-Themostimportantreleasetasks
> > > > >
> > > > > -
> > > > > Denis
> > > > >
> > > > > On Wed, Feb 14, 2018 at 1:49 AM Anton Vinogradov <[hidden email]> wrote:
> > > > >
> > > > > > Vova,
> > > > > >
> > > > > > Could you confirm that
> > > > > > https://issues.apache.org/jira/browse/IGNITE-7527 is ready to be
> > > > > > merged?
> > > > > >
> > > > > > On Wed, Feb 14, 2018 at 12:01 PM, Vladimir Ozerov <[hidden email]> wrote:
> > > > > >
> > > > > > > I would start with NODES and NODE_ATTRIBUTES as the simplest thing.
> > > > > > >
> > > > > > > On Tue, Feb 13, 2018 at 4:10 AM, Denis Magda <[hidden email]> wrote:
> > > > > > >
> > > > > > > > Alex P, sounds like a good plan to me.
> > > > > > > >
> > > > > > > > Vladimir, do you have any suggestions or corrections?
> > > > > > > >
> > > > > > > > -
> > > > > > > > Denis
> > > > > > > >
> > > > > > > > > On Feb 12, 2018, at 4:57 AM, Alex Plehanov <[hidden email]> wrote:
> > > > > > > > >
> > > > > > > > > The views engine and the first view are almost ready to merge
> > > > > > > > > (review comments are resolved). Which views should we take next?
> > > > > > > > > My proposal is NODES, NODE_ATTRIBUTES, NODE_METRICS, NODE_HOSTS
> > > > > > > > > and NODE_ADDRESSES, since these views are clear and all topology
> > > > > > > > > data is available on each node.
> > > > > > > > > Any objections?
> > > > > > > > >
> > > > > > > > > 2018-01-25 16:27 GMT+03:00 Alex Plehanov <[hidden email]>:
> > > > > > > > >
> > > > > > > > > > Anton, Vladimir, I've made some fixes. There is only one view
> > > > > > > > > > left, and it's renamed to 'IGNITE.LOCAL_TRANSACTIONS'.
> > > > > > > > > >
> > > > > > > > > > High-level design of the solution:
> > > > > > > > > > When IgniteH2Indexing starts, it creates and starts a new
> > > > > > > > > > GridH2SysViewProcessor, which creates and registers in H2 (via
> > > > > > > > > > its own table engine) all implementations of system views. Each
> > > > > > > > > > system view implementation extends the base abstract class
> > > > > > > > > > GridH2SysView. A view implementation describes its columns,
> > > > > > > > > > their types and indexes in the constructor and must override the
> > > > > > > > > > getRows method for data retrieval (this method is called by the
> > > > > > > > > > H2-compatible table and index implementations for Ignite system
> > > > > > > > > > views).
> > > > > > > > > > Almost no changes to the existing parsing engine were made,
> > > > > > > > > > except in a few places where a GridH2Table instance was expected
> > > > > > > > > > but system views use another class.
> > > > > > > > > >
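A rough, self-contained sketch of the contract described in the message above: columns are declared in the constructor and getRows() supplies the data. Only the GridH2SysView and getRows names come from the message itself; the stand-in base class, its constructor arguments and the example view are invented for illustration and are not the code from the PR.

import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Stand-in for the GridH2SysView base class described above (not the real one).
abstract class SysViewSketch {
    final String name;
    final List<String> columns;

    SysViewSketch(String name, String... columns) {
        this.name = name;
        this.columns = Arrays.asList(columns);
    }

    // Called by the H2-facing table/index layer to fetch the view's rows.
    abstract Iterable<Object[]> getRows();
}

// Hypothetical local-transactions view built on the stand-in above.
class LocalTransactionsViewSketch extends SysViewSketch {
    LocalTransactionsViewSketch() {
        super("LOCAL_TRANSACTIONS", "XID", "STATE", "START_TIME");
    }

    @Override Iterable<Object[]> getRows() {
        // A real implementation would read the node's active transactions;
        // a single fake row keeps the sketch runnable.
        return Collections.singletonList(
            new Object[] {"xid-1", "ACTIVE", System.currentTimeMillis()});
    }

    public static void main(String[] args) {
        SysViewSketch view = new LocalTransactionsViewSketch();
        for (Object[] row : view.getRows())
            System.out.println(view.name + " " + view.columns + " -> " + Arrays.toString(row));
    }
}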
> > > > > > > > > > New PR: [1]. Please have a look.
> > > > > > > > > >
> > > > > > > > > > [1] https://github.com/apache/ignite/pull/3433
> > > > > > > > > >
> > > > > > > > > > 2018-01-24 19:12 GMT+03:00 Anton Vinogradov <[hidden email]>:
> > > > > > > > > >
> > > > > > > > > > > I've created IEP-13 [1] to cover all cases.
> > > > > > > > > > > Feel free to create issues.
> > > > > > > > > > >
> > > > > > > > > > > [1] https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=75962769
> > > > > > > > > > >
> > > > > > > > > > > On Wed, Jan 24, 2018 at 6:10 PM, Vladimir Ozerov <[hidden email]> wrote:
> > > > > > > > > > >
> > > > > > > > > > > > Let's start with the single simplest view, e.g.
> > > > > > > > > > > > LOCAL_TRANSACTIONS. We will review and merge it along with the
> > > > > > > > > > > > necessary infrastructure. Then we will handle the rest of the
> > > > > > > > > > > > views in separate tickets and separate focused discussions.
> > > > > > > > > > > >
> > > > > > > > > > > > On Wed, Jan 24, 2018 at 5:29 PM, Alex Plehanov <[hidden email]> wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > > 1) It’s not a point of principle, I can change the schema. The
> > > > > > > > > > > > > INFORMATION_SCHEMA was used because it already exists and is
> > > > > > > > > > > > > usually used for metadata tables and views. Your proposal is to
> > > > > > > > > > > > > use the schema “IGNITE”, do I understand you right? BTW, for now
> > > > > > > > > > > > > we can’t query other (H2) meta tables from INFORMATION_SCHEMA,
> > > > > > > > > > > > > so the Ignite system views are the only views available to query
> > > > > > > > > > > > > from this schema.
> > > > > > > > > > > > > 2) Exactly for this reason the IGNITE_INSTANCE view is useful:
> > > > > > > > > > > > > to determine which node we are connected to.
> > > > > > > > > > > > > 3) As the first phase, in my opinion, local views will be
> > > > > > > > > > > > > enough. Performance and caching of distributed views should be
> > > > > > > > > > > > > discussed in the next phases, when the distributed views
> > > > > > > > > > > > > implementation is planned. In the current implementation I tried
> > > > > > > > > > > > > to use indexing for local views wherever possible.
> > > > > > > > > > > > > 4) I don’t think that JVM info is more critical information
> > > > > > > > > > > > > than, for example, cache or node information. When are
> > > > > > > > > > > > > authorization capabilities planned to be implemented?
> > > > > > > > > > > > >
> > > > > > > > > > > > > About local data: yes, we can rename all currently implemented
> > > > > > > > > > > > > views for the local node data as LOCAL_..., and create (someday)
> > > > > > > > > > > > > new whole-cluster views (which use distributed requests) without
> > > > > > > > > > > > > a prefix or, for example, with a CLUSTER_ prefix. But some views
> > > > > > > > > > > > > can show all cluster information using only local node data,
> > > > > > > > > > > > > without distributed requests (for example IGNITE_NODE_METRICS,
> > > > > > > > > > > > > IGNITE_PART_ASSIGNMENT, IGNITE_PART_ALLOCATION, IGNITE_NODES,
> > > > > > > > > > > > > etc.). Are they local or cluster views in this concept? Which
> > > > > > > > > > > > > prefix should be used? And what about caches? Are they local or
> > > > > > > > > > > > > cluster? On a local node we can see cluster-wide caches
> > > > > > > > > > > > > (replicated and distributed) and caches for the current node
> > > > > > > > > > > > > only. The local cache list may differ from node to node. Which
> > > > > > > > > > > > > prefix should be used for this view? And one more thing, for
> > > > > > > > > > > > > some views there is no sense in making them cluster-wide (for
> > > > > > > > > > > > > example IGNITE_INSTANCE). Should we name it LOCAL_INSTANCE
> > > > > > > > > > > > > without creating an INSTANCE view?
> > > > > > > > > > > > >
> > > > > > > > > > > > > So, next steps: split the PR, change the schema name (IGNITE?),
> > > > > > > > > > > > > change the view name for caches (CACHES, LOCAL_CACHES?).
> > > > > > > > > > > > >
> > > > > > > > > > > > > 2018-01-24 13:03 GMT+03:00 Vladimir Ozerov <[hidden email]>:
> > > > > > > > > > > > >
> > > > > > > > > > > > > > Hi Alex,
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > System views could be an extremely valuable addition to Ignite.
> > > > > > > > > > > > > > Ideally, a user should be able to monitor and manage the state of
> > > > > > > > > > > > > > the whole cluster from a single SQL command line. We have had
> > > > > > > > > > > > > > plans to implement this for a very long time. However, it is a
> > > > > > > > > > > > > > very sensitive task which should take a lot of moving pieces into
> > > > > > > > > > > > > > account, such as usability, consistency, performance, security,
> > > > > > > > > > > > > > etc.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Let me point out several major concerns I see at the moment:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > 1) Usability: INFORMATION_SCHEMA
> > > > > > > > > > > > > > This schema is part of the SQL ANSI standard. When creating
> > > > > > > > > > > > > > system views, some vendors prefer to store them in a completely
> > > > > > > > > > > > > > different predefined schema (Oracle, MS SQL). Others prefer to
> > > > > > > > > > > > > > keep them in INFORMATION_SCHEMA directly. Both approaches could
> > > > > > > > > > > > > > work. However, the latter breaks separation of concerns - we
> > > > > > > > > > > > > > store typical metadata next to possibly sensitive system data. It
> > > > > > > > > > > > > > also makes security management more complex - system data is very
> > > > > > > > > > > > > > sensitive, and now we cannot simply grant a user access to
> > > > > > > > > > > > > > INFORMATION_SCHEMA. Instead, we have to grant that access on a
> > > > > > > > > > > > > > per-view basis. For this reason my preference is to store system
> > > > > > > > > > > > > > tables in a separate schema, not in INFORMATION_SCHEMA.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > 2) Consistency: local data
> > > > > > > > > > > > > > One of the implemented views is GridH2SysViewImplInstance.
> > > > > > > > > > > > > > Normally SQL users communicate with Ignite through JDBC/ODBC
> > > > > > > > > > > > > > drivers. These drivers are connected to a single node, typically
> > > > > > > > > > > > > > a client node. Moreover, we will introduce a high-availability
> > > > > > > > > > > > > > feature where drivers are able to connect to any address from a
> > > > > > > > > > > > > > predefined list. That renders this view useless, as you do not
> > > > > > > > > > > > > > know which node you are connected to. Also, local-only data
> > > > > > > > > > > > > > cannot be joined in the general case - you will receive different
> > > > > > > > > > > > > > results on different nodes. The same goes for transactions, JVM
> > > > > > > > > > > > > > info, etc.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > 3) Performance
> > > > > > > > > > > > > > Suppose we fixed the consistency of transactions and now this
> > > > > > > > > > > > > > view shows transactions in the whole cluster with the possibility
> > > > > > > > > > > > > > to filter them by node - this is what a user would expect out of
> > > > > > > > > > > > > > the box. Another problem appears then - performance. How would we
> > > > > > > > > > > > > > collect the necessary data? How would we handle joins, when a
> > > > > > > > > > > > > > particular view could be scanned multiple times during query
> > > > > > > > > > > > > > execution? How do we achieve sensible consistency? Most probably
> > > > > > > > > > > > > > we would collect remote data once when the query is started,
> > > > > > > > > > > > > > cache it somehow at the query session level, and then re-use it
> > > > > > > > > > > > > > during joins. But again, this should be discussed separately.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > 4) Security: JVM info
> > > > > > > > > > > > > > We should define clear boundaries of what info is exposed. JVM
> > > > > > > > > > > > > > data, along with running threads, is critically sensitive
> > > > > > > > > > > > > > information. We should not expose it until we have authorization
> > > > > > > > > > > > > > capabilities.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > In order to start moving this code from prototype to production
> > > > > > > > > > > > > > state we should start with the most simple and consistent views,
> > > > > > > > > > > > > > e.g. IGNITE_CACHES. Let's move it to a separate PR, review the
> > > > > > > > > > > > > > infrastructure code, review the view implementation, agree on
> > > > > > > > > > > > > > proper naming and placement, and merge it. Then each and every
> > > > > > > > > > > > > > view (or group of related views) should be discussed and reviewed
> > > > > > > > > > > > > > separately.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > As for the node-local stuff, maybe we should move it to a
> > > > > > > > > > > > > > separate schema, or mark it with a special prefix, e.g.
> > > > > > > > > > > > > > "IGNITE.TRANSACTIONS" - all transactions in the cluster,
> > > > > > > > > > > > > > "IGNITE.LOCAL_TRANSACTIONS" - transactions on the local node. In
> > > > > > > > > > > > > > this case we will be able to merge the "local" stuff shortly, and
> > > > > > > > > > > > > > implement the more complex but at the same time much more useful
> > > > > > > > > > > > > > distributed stuff later on.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Makes sense?
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Vladimir.
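To make the LOCAL_ naming proposal above concrete, a small sketch of how the two flavors would be told apart from a client. The IGNITE.LOCAL_TRANSACTIONS name follows the renaming mentioned earlier in the thread, and the XID/START_TIME columns are taken from the prototype's query examples; the cluster-wide IGNITE.TRANSACTIONS view referenced in the comment is only a proposal here, not an existing schema.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class LocalVsClusterViews {
    public static void main(String[] args) throws Exception {
        try (Connection conn = DriverManager.getConnection("jdbc:ignite:thin://127.0.0.1");
             Statement stmt = conn.createStatement();
             // Local flavor: transactions seen by the node this client is attached to.
             ResultSet rs = stmt.executeQuery(
                 "SELECT XID, START_TIME FROM IGNITE.LOCAL_TRANSACTIONS")) {
            while (rs.next())
                System.out.println("local tx " + rs.getObject(1) + " started " + rs.getObject(2));
            // A cluster-wide flavor (IGNITE.TRANSACTIONS, per the proposal above) would be
            // queried the same way, just without the LOCAL_ prefix.
        }
    }
}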