Irrelevant data in discovery messages

Denis Mekhanikov
Igniters,

It turns out that we are sending a lot of irrelevant information in discovery
messages. Some messages contain *TcpDiscoveryNode* objects, which in turn
carry attributes such as *PATH, java.class.path, sun.boot.class.path,
java.library.path, org.apache.ignite.jvm.args,* etc.
Some of these attributes may contain huge strings that can add up to
megabytes of data.
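
To get a feel for how much data this can be on a given machine, here is a minimal sketch that sums the size of the local JVM's system properties and environment variables. The class and method names are made up for illustration and are not part of Ignite, and the figure ignores serialization overhead, so treat it as a rough lower bound rather than the real message size.

```java
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper (not an Ignite class): estimates the raw size of property-style attributes. */
final class AttributeSizeEstimate {
    /** Sums the UTF-8 length of every key and value; serialization overhead is ignored. */
    static long utf8Bytes(Map<String, String> attrs) {
        long total = 0;
        for (Map.Entry<String, String> e : attrs.entrySet()) {
            total += e.getKey().getBytes(StandardCharsets.UTF_8).length;
            total += e.getValue().getBytes(StandardCharsets.UTF_8).length;
        }
        return total;
    }

    /** Collects environment variables and system properties of the current JVM into one map. */
    static Map<String, String> localProperties() {
        Map<String, String> all = new LinkedHashMap<>(System.getenv());
        for (String name : System.getProperties().stringPropertyNames())
            all.put(name, System.getProperty(name));
        return all;
    }

    public static void main(String[] args) {
        Map<String, String> all = localProperties();
        System.out.println(all.size() + " entries, about " + utf8Bytes(all) + " bytes");

        // The attributes named in this thread are usually the largest ones.
        for (String name : new String[] {"PATH", "java.class.path", "java.library.path"}) {
            String val = all.get(name);
            System.out.println(name + ": " + (val == null ? 0 : val.length()) + " chars");
        }
    }
}
```

Running it on a node with a long classpath should show quickly whether a handful of attributes dominate the total.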

It was noticed by a user on our mailing list:
http://apache-ignite-users.70518.x6.nabble.com/Connection-problem-between-client-and-server-td19243.html
In his case, these huge messages make the discovery process really slow.

I think we should filter out such attributes, because they are not used
anywhere but make messages grow enormously and slow down discovery. We could
include only user-defined and internal attributes plus a fixed set of
environment variables.

What do you think?

Denis

Re: Irrelevant data in discovery messages

dsetrakyan
Absolutely agree. I thought we could already filter out system and
environment properties on startup via configuration. I think it was
implemented by Yakov a long time ago. Yakov Zhdanov, can you please chime
in?

Denis, the information you mention is static and does not change. It would
be enough to include it only in join requests and not in regular heartbeats.
I hope that this is already the case. Can you please confirm?

D.

On Wed, Jan 10, 2018 at 12:30 AM, Denis Mekhanikov <[hidden email]>
wrote:

> Igniters,
>
> It turns out that we are sending a lot of irrelevant information in discovery
> messages. Some messages contain *TcpDiscoveryNode* objects, which in turn
> carry attributes such as *PATH, java.class.path, sun.boot.class.path,
> java.library.path, org.apache.ignite.jvm.args,* etc.
> Some of these attributes may contain huge strings that can add up to
> megabytes of data.
>
> It was noticed by a user on our mailing list:
> http://apache-ignite-users.70518.x6.nabble.com/Connection-problem-between-client-and-server-td19243.html
> In his case, these huge messages make the discovery process really slow.
>
> I think we should filter out such attributes, because they are not used
> anywhere but make messages grow enormously and slow down discovery. We could
> include only user-defined and internal attributes plus a fixed set of
> environment variables.
>
> What do you think?
>
> Denis
>

Re: Irrelevant data in discovery messages

Denis Mekhanikov
Dmitriy,

I looked through all message implementations in Ignite, and
only TcpDiscoveryNodeAddedMessage seems to contain this information. So it
is sent only when new nodes are connecting.
But this message is pretty fat, and it grows with the topology version. It
contains the current topology and the topology history, so all nodes with
all of their attributes may be repeated many times in one message.
If there is a configurable mechanism for filtering out irrelevant
attributes, it could serve as a workaround.
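
To illustrate why the repetition matters, here is a rough back-of-the-envelope sketch. It assumes a worst case in which every topology history snapshot lists every node and each copy carries its full attribute map. The class, the method, and all the numbers are hypothetical and are not measurements from a real cluster.

```java
/** Hypothetical worst-case estimate (not Ignite code) of attribute bytes in one node-added message. */
final class NodeAddedMessageEstimate {
    /**
     * @param attrBytesPerNode  assumed size of one node's attribute map
     * @param nodes             number of nodes in the topology
     * @param historySnapshots  number of topology history entries carried by the message
     */
    static long estimateBytes(long attrBytesPerNode, int nodes, int historySnapshots) {
        // One copy of each node for the current topology, plus one copy per history snapshot.
        long copies = (long) nodes * (1 + historySnapshots);
        return attrBytesPerNode * copies;
    }

    public static void main(String[] args) {
        // Assumed figures: 50 KB of attributes per node, 20 nodes, 10 history snapshots.
        long bytes = estimateBytes(50_000, 20, 10);
        System.out.println(bytes + " bytes"); // 11000000 bytes under these assumptions
    }
}
```

Even with modest per-node attributes, the multiplication by topology size and history length is what pushes a single message into the megabyte range.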

Denis

On Wed, Jan 10, 2018 at 22:32, Dmitriy Setrakyan <[hidden email]> wrote:

> Absolutely agree. I thought we could already filter out system and
> environment properties on startup via configuration. I think it was
> implemented by Yakov a long time ago. Yakov Zhdanov, can you please chime
> in?
>
> Denis, the information you mention is static and does not change. It would
> be enough to include it only in join requests and not in regular heartbeats.
> I hope that this is already the case. Can you please confirm?
>
> D.

Re: Irrelevant data in discovery messages

Dmitriy Setrakyan-2
Denis, we have a "setNodeAttributes(...)" [1] method on the TCP discovery SPI. I
believe it may be used to specify which attributes to include during the
node-join process. Can you please take a look?

[1]
https://ignite.apache.org/releases/latest/javadoc/org/apache/ignite/spi/discovery/tcp/TcpDiscoverySpi.html#setNodeAttributes(java.util.Map,%20org.apache.ignite.lang.IgniteProductVersion)

D.

On Thu, Jan 11, 2018 at 10:15 AM, Denis Mekhanikov <[hidden email]>
wrote:

> Dmitriy,
>
> I looked through all message implementations in Ignite, and
> only TcpDiscoveryNodeAddedMessage seems to contain this information. So it
> is sent only when new nodes are connecting.
> But this message is pretty fat, and it grows with the topology version. It
> contains the current topology and the topology history, so all nodes with
> all of their attributes may be repeated many times in one message.
> If there is a configurable mechanism for filtering out irrelevant
> attributes, it could serve as a workaround.
>
> Denis

Re: Irrelevant data in discovery messages

yzhdanov
Guys, this config property allows filtering out system/env properties:
org.apache.ignite.configuration.IgniteConfiguration#getIncludeProperties

By default it returns null, which means "include everything". Pass an empty
array to filter out everything, or a non-empty array to include only the
listed properties.
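
The three cases can be modelled in plain Java as a minimal sketch. This is not Ignite's implementation: the class and method names are hypothetical, and the lookup order for a listed name (system property first, then environment variable) is an assumption.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical model (not Ignite code) of the null / empty / non-empty semantics described above. */
final class IncludePropertiesModel {
    static Map<String, String> select(String[] include,
                                      Map<String, String> sysProps,
                                      Map<String, String> env) {
        Map<String, String> out = new LinkedHashMap<>();

        // null means "include everything".
        if (include == null) {
            out.putAll(env);
            out.putAll(sysProps);
            return out;
        }

        // An empty array selects nothing; otherwise only the listed names are kept.
        for (String name : include) {
            String val = sysProps.get(name); // assumed order: system property first
            if (val == null)
                val = env.get(name);         // then environment variable
            if (val != null)
                out.put(name, val);
        }
        return out;
    }
}
```

With this model, `select(new String[0], ...)` yields an empty map, which is the "filter all" case.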

--Yakov

Re: Irrelevant data in discovery messages

Denis Mekhanikov
Thanks, Yakov!
Maybe we should change the default behaviour?
Any environment contains at least the *PATH* and *java.class.path* variables,
and nobody uses them. I don't think we should include them in
node attributes by default.

Denis

On Mon, Jan 15, 2018 at 13:02, Yakov Zhdanov <[hidden email]> wrote:

> Guys, this config property allows filtering out system/env properties:
> org.apache.ignite.configuration.IgniteConfiguration#getIncludeProperties
>
> By default it returns null, which means "include everything". Pass an empty
> array to filter out everything, or a non-empty array to include only the
> listed properties.
>
> --Yakov
>

Re: Irrelevant data in discovery messages

yzhdanov
What if we change the default to an empty array? Users who want to include
everything would set it to null manually.

--Yakov

Re: Irrelevant data in discovery messages

Denis Mekhanikov
At least the *java.net.preferIPv4Stack* parameter shouldn't be ignored. We
check that nodes have the same value of this parameter and print a
warning otherwise. It is pretty helpful sometimes.
Maybe some other parameters are also used internally. I think they should be
added to node attributes regardless of the configuration.
Together with this, the default value of *includeProperties* could be made
an empty array.
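
A minimal sketch of this idea, assuming the set of internally used properties is known up front. The class, the `REQUIRED` list, and both methods are hypothetical, and only *java.net.preferIPv4Stack* comes from this thread. It is not how Ignite builds node attributes today.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

/** Hypothetical sketch (not Ignite code): required properties bypass the includeProperties filter. */
final class RequiredAttributesSketch {
    /** Properties assumed to be used internally; the thread names only this one. */
    static final List<String> REQUIRED = List.of("java.net.preferIPv4Stack");

    static Map<String, String> nodeAttributes(String[] include, Map<String, String> props) {
        Map<String, String> out = new LinkedHashMap<>();

        if (include == null)
            out.putAll(props); // null keeps the "include everything" behaviour
        else
            for (String name : include)
                if (props.containsKey(name))
                    out.put(name, props.get(name));

        // Required properties are added regardless of the filter.
        for (String name : REQUIRED)
            if (props.containsKey(name))
                out.put(name, props.get(name));

        return out;
    }

    /** True when two nodes disagree on java.net.preferIPv4Stack, the case that triggers the warning. */
    static boolean ipv4StackMismatch(Map<String, String> local, Map<String, String> remote) {
        String key = "java.net.preferIPv4Stack";
        return !Objects.equals(local.get(key), remote.get(key));
    }
}
```

With an empty array as the default, a node would then publish only the required properties unless the user lists more.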

Denis

On Mon, Jan 15, 2018 at 14:39, Yakov Zhdanov <[hidden email]> wrote:

> What if we change the default to an empty array? Users who want to include
> everything would set it to null manually.
>
> --Yakov
>

Re: Irrelevant data in discovery messages

yzhdanov
I agree with your suggestion to add the required properties to node attributes
under a specific key, regardless of the includeProperties value.

--Yakov