Apache Ignite Developers - Legacy Mail Archive

Sort nodes in the ring in order to minimize the number of reconnections

Classic

List

Threaded

41 messages Options

123

Alexei Scherbakov

Re: Sort nodes in the ring in order to minimize the number of reconnections

2016-12-27 10:42 GMT+03:00 Yakov Zhdanov <[hidden email]>:

> >>
> My main concern here is code complexity. Yakov, how difficult it is to
> stick a new node in an arbitrary spot of a discovery ring?
> >>
>
> Dmitry, I think this is not hard. At least I don't see any issue now.
>
> >>
> I think the NodeComparator approach will work. User can chose how to sort
> nodes from one rack before nodes from another rack. Same goes for subnets,
> or data centers.
> >>
>
> Dmitry, can you please explain why you enforce user to write code? This
> does not seem convenient to me at all. If user wants to write code then he
> can do it for calculating proper arc_id.
>

Yakov, where is no need to for user to write code. We can provide two
default Comparator implementations:
first based on IP address(default), and second based on node attribute.
User just plugs one of the implementations and adds node attribute to node
config in second case - let it be ARC_ID by default.

>
> Another point I already posted to this thread - this is very error prone.
>
> >>
> I am strongly against giving user an opportunity to point exact place in
> the ring with somewhat like this interface [int getIdex(Node newNode,
> List<Node> currentRing)]. This is very error prone and may require tricky
> consistency checks just to make sure that implementation of this interface
> is consistent along the topology.
> With "arcs" approach user can automatically assign proper ids basing on
> physical network topology and network routes.
> >>
>
> I still think arc_id is better:
> 1. No code from user side. Only env variable or system property on a
> machine.
> 2. All code inside Ignite - easy to fix and change if required.
> 3. All benefits of comparator are still available.
>

I suppose my approach is more generic and also matches listed requirements.

>
> Alex, I still don't get how you (and other guys as well) want to deal with
> latencies here. I would like you explain how you solve this - you have 1000
> IP addresses, and you need to sort them in your beloved latency order, but
> please note that you need to get exactly the same ring on all of these 1000
> machines.
>

Calculating latencies are beyond scope of generic approach of nodes
ordering.
It's just of one of possible NodeComparator implementations.
Let's not bother this it right now.

>
> --Yakov
>

--

Best regards,
Alexei Scherbakov

daradurvs

Re: Sort nodes in the ring in order to minimize the number of reconnections

I have described a task: https://issues.apache.org/jira/browse/IGNITE-4501

and linked a bug https://issues.apache.org/jira/browse/IGNITE-4499

Alex Menshikov, maybe you will take her?

2016-12-27 13:32 GMT+03:00 Alexei Scherbakov <[hidden email]>:

> 2016-12-27 10:42 GMT+03:00 Yakov Zhdanov <[hidden email]>:
>
> > >>
> > My main concern here is code complexity. Yakov, how difficult it is to
> > stick a new node in an arbitrary spot of a discovery ring?
> > >>
> >
> > Dmitry, I think this is not hard. At least I don't see any issue now.
> >
> > >>
> > I think the NodeComparator approach will work. User can chose how to sort
> > nodes from one rack before nodes from another rack. Same goes for
> subnets,
> > or data centers.
> > >>
> >
> > Dmitry, can you please explain why you enforce user to write code? This
> > does not seem convenient to me at all. If user wants to write code then
> he
> > can do it for calculating proper arc_id.
> >
>
> Yakov, where is no need to for user to write code. We can provide two
> default Comparator implementations:
> first based on IP address(default), and second based on node attribute.
> User just plugs one of the implementations and adds node attribute to node
> config in second case - let it be ARC_ID by default.
>
>
> >
> > Another point I already posted to this thread - this is very error prone.
> >
> > >>
> > I am strongly against giving user an opportunity to point exact place in
> > the ring with somewhat like this interface [int getIdex(Node newNode,
> > List<Node> currentRing)]. This is very error prone and may require tricky
> > consistency checks just to make sure that implementation of this
> interface
> > is consistent along the topology.
> > With "arcs" approach user can automatically assign proper ids basing on
> > physical network topology and network routes.
> > >>
> >
> > I still think arc_id is better:
> > 1. No code from user side. Only env variable or system property on a
> > machine.
> > 2. All code inside Ignite - easy to fix and change if required.
> > 3. All benefits of comparator are still available.
> >
>
> I suppose my approach is more generic and also matches listed requirements.
>
>
> >
> > Alex, I still don't get how you (and other guys as well) want to deal
> with
> > latencies here. I would like you explain how you solve this - you have
> 1000
> > IP addresses, and you need to sort them in your beloved latency order,
> but
> > please note that you need to get exactly the same ring on all of these
> 1000
> > machines.
> >
>
> Calculating latencies are beyond scope of generic approach of nodes
> ordering.
> It's just of one of possible NodeComparator implementations.
> Let's not bother this it right now.
>
>
> >
> > --Yakov
> >
>
>
>
> --
>
> Best regards,
> Alexei Scherbakov
>

Александр Меньшиков

Re: Sort nodes in the ring in order to minimize the number of reconnections

Yes, i can. But someone needs to give me the rights of contributor in Jira.

2016-12-27 17:07 GMT+03:00 Vyacheslav Daradur <[hidden email]>:

> I have described a task: https://issues.apache.org/jira/browse/IGNITE-4501
>
> and linked a bug https://issues.apache.org/jira/browse/IGNITE-4499
>
> Alex Menshikov, maybe you will take her?
>
>
> 2016-12-27 13:32 GMT+03:00 Alexei Scherbakov <[hidden email]
> >:
>
> > 2016-12-27 10:42 GMT+03:00 Yakov Zhdanov <[hidden email]>:
> >
> > > >>
> > > My main concern here is code complexity. Yakov, how difficult it is to
> > > stick a new node in an arbitrary spot of a discovery ring?
> > > >>
> > >
> > > Dmitry, I think this is not hard. At least I don't see any issue now.
> > >
> > > >>
> > > I think the NodeComparator approach will work. User can chose how to
> sort
> > > nodes from one rack before nodes from another rack. Same goes for
> > subnets,
> > > or data centers.
> > > >>
> > >
> > > Dmitry, can you please explain why you enforce user to write code? This
> > > does not seem convenient to me at all. If user wants to write code then
> > he
> > > can do it for calculating proper arc_id.
> > >
> >
> > Yakov, where is no need to for user to write code. We can provide two
> > default Comparator implementations:
> > first based on IP address(default), and second based on node attribute.
> > User just plugs one of the implementations and adds node attribute to
> node
> > config in second case - let it be ARC_ID by default.
> >
> >
> > >
> > > Another point I already posted to this thread - this is very error
> prone.
> > >
> > > >>
> > > I am strongly against giving user an opportunity to point exact place
> in
> > > the ring with somewhat like this interface [int getIdex(Node newNode,
> > > List<Node> currentRing)]. This is very error prone and may require
> tricky
> > > consistency checks just to make sure that implementation of this
> > interface
> > > is consistent along the topology.
> > > With "arcs" approach user can automatically assign proper ids basing on
> > > physical network topology and network routes.
> > > >>
> > >
> > > I still think arc_id is better:
> > > 1. No code from user side. Only env variable or system property on a
> > > machine.
> > > 2. All code inside Ignite - easy to fix and change if required.
> > > 3. All benefits of comparator are still available.
> > >
> >
> > I suppose my approach is more generic and also matches listed
> requirements.
> >
> >
> > >
> > > Alex, I still don't get how you (and other guys as well) want to deal
> > with
> > > latencies here. I would like you explain how you solve this - you have
> > 1000
> > > IP addresses, and you need to sort them in your beloved latency order,
> > but
> > > please note that you need to get exactly the same ring on all of these
> > 1000
> > > machines.
> > >
> >
> > Calculating latencies are beyond scope of generic approach of nodes
> > ordering.
> > It's just of one of possible NodeComparator implementations.
> > Let's not bother this it right now.
> >
> >
> > >
> > > --Yakov
> > >
> >
> >
> >
> > --
> >
> > Best regards,
> > Alexei Scherbakov
> >
>

Александр Меньшиков

Re: Sort nodes in the ring in order to minimize the number of reconnections

My JIRA account is:
Username:sharplerFull Name:Alexander Menshikov

2016-12-27 17:22 GMT+03:00 Александр Меньшиков <[hidden email]>:

> Yes, i can. But someone needs to give me the rights of contributor in Jira.
>
> 2016-12-27 17:07 GMT+03:00 Vyacheslav Daradur <[hidden email]>:
>
>> I have described a task: https://issues.apache.org/jira
>> /browse/IGNITE-4501
>>
>> and linked a bug https://issues.apache.org/jira/browse/IGNITE-4499
>>
>> Alex Menshikov, maybe you will take her?
>>
>>
>> 2016-12-27 13:32 GMT+03:00 Alexei Scherbakov <
>> [hidden email]>:
>>
>> > 2016-12-27 10:42 GMT+03:00 Yakov Zhdanov <[hidden email]>:
>> >
>> > > >>
>> > > My main concern here is code complexity. Yakov, how difficult it is to
>> > > stick a new node in an arbitrary spot of a discovery ring?
>> > > >>
>> > >
>> > > Dmitry, I think this is not hard. At least I don't see any issue now.
>> > >
>> > > >>
>> > > I think the NodeComparator approach will work. User can chose how to
>> sort
>> > > nodes from one rack before nodes from another rack. Same goes for
>> > subnets,
>> > > or data centers.
>> > > >>
>> > >
>> > > Dmitry, can you please explain why you enforce user to write code?
>> This
>> > > does not seem convenient to me at all. If user wants to write code
>> then
>> > he
>> > > can do it for calculating proper arc_id.
>> > >
>> >
>> > Yakov, where is no need to for user to write code. We can provide two
>> > default Comparator implementations:
>> > first based on IP address(default), and second based on node attribute.
>> > User just plugs one of the implementations and adds node attribute to
>> node
>> > config in second case - let it be ARC_ID by default.
>> >
>> >
>> > >
>> > > Another point I already posted to this thread - this is very error
>> prone.
>> > >
>> > > >>
>> > > I am strongly against giving user an opportunity to point exact place
>> in
>> > > the ring with somewhat like this interface [int getIdex(Node newNode,
>> > > List<Node> currentRing)]. This is very error prone and may require
>> tricky
>> > > consistency checks just to make sure that implementation of this
>> > interface
>> > > is consistent along the topology.
>> > > With "arcs" approach user can automatically assign proper ids basing
>> on
>> > > physical network topology and network routes.
>> > > >>
>> > >
>> > > I still think arc_id is better:
>> > > 1. No code from user side. Only env variable or system property on a
>> > > machine.
>> > > 2. All code inside Ignite - easy to fix and change if required.
>> > > 3. All benefits of comparator are still available.
>> > >
>> >
>> > I suppose my approach is more generic and also matches listed
>> requirements.
>> >
>> >
>> > >
>> > > Alex, I still don't get how you (and other guys as well) want to deal
>> > with
>> > > latencies here. I would like you explain how you solve this - you have
>> > 1000
>> > > IP addresses, and you need to sort them in your beloved latency order,
>> > but
>> > > please note that you need to get exactly the same ring on all of these
>> > 1000
>> > > machines.
>> > >
>> >
>> > Calculating latencies are beyond scope of generic approach of nodes
>> > ordering.
>> > It's just of one of possible NodeComparator implementations.
>> > Let's not bother this it right now.
>> >
>> >
>> > >
>> > > --Yakov
>> > >
>> >
>> >
>> >
>> > --
>> >
>> > Best regards,
>> > Alexei Scherbakov
>> >
>>
>
>

dsetrakyan

Re: Sort nodes in the ring in order to minimize the number of reconnections

In reply to this post by Alexei Scherbakov

On Tue, Dec 27, 2016 at 2:32 AM, Alexei Scherbakov <
[hidden email]> wrote:

> 2016-12-27 10:42 GMT+03:00 Yakov Zhdanov <[hidden email]>:
> > I think the NodeComparator approach will work. User can chose how to sort
> > nodes from one rack before nodes from another rack. Same goes for
> subnets,
> > or data centers.
> > >>
> >
> > Dmitry, can you please explain why you enforce user to write code? This
> > does not seem convenient to me at all. If user wants to write code then
> he
> > can do it for calculating proper arc_id.
> >
>
> Yakov, where is no need to for user to write code. We can provide two
> default Comparator implementations:
> first based on IP address(default), and second based on node attribute.
> User just plugs one of the implementations and adds node attribute to node
> config in second case - let it be ARC_ID by default.

Completely agree with Alexey here. NodeComparator sounds like a generic
approach. We can provide various implementations of comparator with
different sorting strategies out of the box.

dsetrakyan

Re: Sort nodes in the ring in order to minimize the number of reconnections

Actually, after giving it some thought, I now think that the same kind of
flexibility can be achieved by giving multiple nodes the same
CLUSTER_REGION_ID (don't like the arc id). For example, nodes in 2 racks
could be given CLUSTER_REGION_ID of 1 and 2. This way all nodes in rack 1
or rack 2 would be next to each other in the cluster ring.

Do you think we will ever care about the order of nodes within the same
region, e.g. does the order of nodes within the same rack matter?

D.

On Tue, Dec 27, 2016 at 7:30 AM, Dmitriy Setrakyan <[hidden email]>
wrote:

>
>
> On Tue, Dec 27, 2016 at 2:32 AM, Alexei Scherbakov <
> [hidden email]> wrote:
>
>> 2016-12-27 10:42 GMT+03:00 Yakov Zhdanov <[hidden email]>:
>> > I think the NodeComparator approach will work. User can chose how to
>> sort
>> > nodes from one rack before nodes from another rack. Same goes for
>> subnets,
>> > or data centers.
>> > >>
>> >
>> > Dmitry, can you please explain why you enforce user to write code? This
>> > does not seem convenient to me at all. If user wants to write code then
>> he
>> > can do it for calculating proper arc_id.
>> >
>>
>> Yakov, where is no need to for user to write code. We can provide two
>> default Comparator implementations:
>> first based on IP address(default), and second based on node attribute.
>> User just plugs one of the implementations and adds node attribute to node
>> config in second case - let it be ARC_ID by default.
>
>
> Completely agree with Alexey here. NodeComparator sounds like a generic
> approach. We can provide various implementations of comparator with
> different sorting strategies out of the box.
>

yzhdanov

Re: Sort nodes in the ring in order to minimize the number of reconnections

I am OK with CLUSTER_REGION_ID. However I would like this name -
DISCOVERY_REGION_ID - more.

>>
Do you think we will ever care about the order of nodes within the same
region, e.g. does the order of nodes within the same rack matter?
>>

I think this is too much, but if you care you still can. Imagine you have
1000..1010 values for machines in rack 1 and 2000..2010 values for machines
in rack 2. So you can set exact position for each machine. This approach
provides full flexibility.

I will update https://issues.apache.org/jira/browse/IGNITE-4501 shortly.

--Yakov

yzhdanov

Re: Sort nodes in the ring in order to minimize the number of reconnections

Guys, I have updated the ticket.

Alexander Menshikov, when do you expect the implementation to be finished?

Александр Меньшиков

Re: Sort nodes in the ring in order to minimize the number of reconnections

I think that in the weeks after the 'new year' holidays or sooner.

2016-12-29 13:28 GMT+03:00 Yakov Zhdanov <[hidden email]>:

> Guys, I have updated the ticket.
>
> Alexander Menshikov, when do you expect the implementation to be finished?
>

yzhdanov

Re: Sort nodes in the ring in order to minimize the number of reconnections

Alexander, sounds good! Please post updates to ticket and this thread (if
necessary) while working.

--Yakov

Александр Меньшиков

Re: Sort nodes in the ring in order to minimize the number of reconnections

I done that things:

-- Add to TcpDiscoverySpi field Comparator<TcpDiscoveryNode> nodeComparator
for load custom comparators from config file like bean.
-- Add implementation with old behavior: BaseNodeComparator
-- Add region id implementation: RegionNodeComparator which get map from IP
address to region ID in constructor.
-- Modified TcpDiscoveryNodesRing#nextNode for using nodeComparator for
find next node.

You can see that in PR: https://github.com/apache/ignite/pull/1436

Main question is: how to test it?

For my local test i just changed BaseNodeComparator with this odd
comparator:

new Comparator<TcpDiscoveryNode>() {
@Override
public int compare(TcpDiscoveryNode t1, TcpDiscoveryNode
t2) {
//shuffle nodes
final int ans =
Long.compare((t1.internalOrder()*3L+13L)%4L,
(t2.internalOrder()*3L+13L)%4L);
return (ans==0)?t1.compareTo(t2):ans;
}
};

It's looking scary, but in fact it just consistently shuffle nodes. If you
have 4 nodes with topology versions 1, 2, 3 and 4, it will be ring: 1-4-3-2.

So I think if we just using in old test this shuffle comparator and nothing
gone wrong it's good enough.

But any way I don't know how to add that to tests.

And may be we need some test for custom comparators. But in fact comparators
just must be valid Java comparator and work the same on all nodes.

Any comments are welcome.

yzhdanov

Re: Sort nodes in the ring in order to minimize the number of reconnections

Alexander, I was against any comparator and user defined logic exactly for
reason that comparison may be inconsistent. After long discussion and
consensus you implement approach with comparator. Can you please explain
why you did not just add logic to compare the value of CLUSTER_REGION_ID
node attribute?

--Yakov

2017-01-18 16:48 GMT+03:00 Александр Меньшиков <[hidden email]>:

> I done that things:
>
> -- Add to TcpDiscoverySpi field Comparator<TcpDiscoveryNode>
> nodeComparator for load custom comparators from config file like bean.
> -- Add implementation with old behavior: BaseNodeComparator
> -- Add region id implementation: RegionNodeComparator which get map from
> IP address to region ID in constructor.
> -- Modified TcpDiscoveryNodesRing#nextNode for using nodeComparator for
> find next node.
>
> You can see that in PR: https://github.com/apache/ignite/pull/1436
>
> Main question is: how to test it?
>
> For my local test i just changed BaseNodeComparator with this odd
> comparator:
>
> new Comparator<TcpDiscoveryNode>() {
> @Override
> public int compare(TcpDiscoveryNode t1, TcpDiscoveryNode
> t2) {
> //shuffle nodes
> final int ans = Long.compare((t1.internalOrder()*3L+13L)%4L,
> (t2.internalOrder()*3L+13L)%4L);
> return (ans==0)?t1.compareTo(t2):ans;
> }
> };
>
> It's looking scary, but in fact it just consistently shuffle nodes. If you
> have 4 nodes with topology versions 1, 2, 3 and 4, it will be ring: 1-4-3-2.
>
> So I think if we just using in old test this shuffle comparator and
> nothing gone wrong it's good enough.
>
> But any way I don't know how to add that to tests.
>
> And may be we need some test for custom comparators. But in fact comparators
> just must be valid Java comparator and work the same on all nodes.
>
> Any comments are welcome.
>

Александр Меньшиков

Re: Sort nodes in the ring in order to minimize the number of reconnections

Yakov, as I understand it we need add CLUSTER_REGION_ID for each nodes in
config file. And in fact using some kind of sort in nextNode method (the
search for extreme values to be exact). And the existence of valid
comparator is a sufficient condition to sort nodes to build new correct
ring. So I has thought we will not get any extra benefits (performance or
maintainability) if we close the ability for users to set their sort logic.
Code will a similar in two variants. I has thought if I show this variant
will be easier to see that variant is okay.
But if not, then I can fast change code.

yzhdanov

Re: Sort nodes in the ring in order to minimize the number of reconnections

Alexander, as far as I remember we talked about having cluster id set via
TcpDiscoverySpi configuration and also via system property.

What do you mean by "existence of sufficient comparator"? If we require
that attribute is integer then we don't have any problems. If it is not
integer then you should throw exception on start.

I repeat this once again - we need to (1)prevent our users from falling
into terrible discovery issues (which are hard to debug) if users provide
inconsistent comparators on different nodes, but we also need to have
(2)full flexibility and control over nodes ordering. I think if we don't
have comparator on public API then we are still OK with both points above.
Therefore, I would like you to implement this feature using the approach we
agreed on before. Let me know if you want me to take a look at the code
before you proceed.

--Yakov

dsetrakyan

Re: Sort nodes in the ring in order to minimize the number of reconnections

I agree with Yakov. Having an integer as region ID should be sufficient to
support all the use cases.

On Thu, Jan 19, 2017 at 4:37 AM, Yakov Zhdanov <[hidden email]> wrote:

> Alexander, as far as I remember we talked about having cluster id set via
> TcpDiscoverySpi configuration and also via system property.
>
> What do you mean by "existence of sufficient comparator"? If we require
> that attribute is integer then we don't have any problems. If it is not
> integer then you should throw exception on start.
>
> I repeat this once again - we need to (1)prevent our users from falling
> into terrible discovery issues (which are hard to debug) if users provide
> inconsistent comparators on different nodes, but we also need to have
> (2)full flexibility and control over nodes ordering. I think if we don't
> have comparator on public API then we are still OK with both points above.
> Therefore, I would like you to implement this feature using the approach we
> agreed on before. Let me know if you want me to take a look at the code
> before you proceed.
>
> --Yakov
>

Александр Меньшиков

Re: Sort nodes in the ring in order to minimize the number of reconnections

In reply to this post by yzhdanov

Yakov, I changed the implementation. Please, look at my code again.

https://github.com/apache/ignite/pull/1436

2017-01-19 15:37 GMT+03:00 Yakov Zhdanov <[hidden email]>:

Игорь Г

Re: Sort nodes in the ring in order to minimize the number of reconnections

Alexander, maybe you should use presorted collection in
TcpDiscoveryNodesRing.nextNode instead of iterating through unsorted one
every time?

Александр Меньшиков

Re: Sort nodes in the ring in order to minimize the number of reconnections

Igor, I have thought about approach what you are talking about. It need add
new field named like "sortedNodes" with custom ordering, which will have
the same items as "nodes" field, because "nodes" has being used with
default ordering in other methods. It have this advantages:

1. Method "nextNode" will look simpler.
2. Method "nextNode" will work faster, because using of method
TreeSet#higher() will be available. But that possibility had not been used
in original code. And I don't why.

But also have some disadvantages because new field "sortedNodes" will be
strongly connected with "nodes":
1. It need copy-paste all code, which modifies "nodes" in 4 other methods.
It will decrease maintainability.
2. Field "nodes" is being used with "copy-on-write" algorithm. So state of
"nodes" and "sortedNodes" can be inconsistent. Maybe it's okay, in fact I
just don't know. But any way in future it may become a problem.

So my opinion is that "presorted" approach can work a little bit faster
(number of nodes never can't be so big that O(log n) became more faster
than O(n)), but code complexity will been increased, because it will add
one logic connection inside the whole class "TcpDiscoveryNodesRing".

Yakov, can you settle our argument?

2017-01-20 16:30 GMT+03:00 Игорь Г <[hidden email]>:

> Alexander, maybe you should use presorted collection in
> TcpDiscoveryNodesRing.nextNode instead of iterating through unsorted one
> every time?
>

Александр Меньшиков

Re: Sort nodes in the ring in order to minimize the number of reconnections

Done. Please look.

JIRA: https://issues.apache.org/jira/browse/IGNITE-4501
PR: https://github.com/apache/ignite/pull/1436/files
Tests: http://ci.ignite.apache.org/project.html?projectId=IgniteTes
ts&tab=projectOverview&branch_IgniteTests=pull/1436/head

Александр Меньшиков

Re: Sort nodes in the ring in order to minimize the number of reconnections

Need to do code review until February 17, if we want to get this feature in
version 1.9.

2017-02-08 22:14 GMT+03:00 Александр Меньшиков <[hidden email]>:

> Done. Please look.
>
> JIRA: https://issues.apache.org/jira/browse/IGNITE-4501
> PR: https://github.com/apache/ignite/pull/1436/files
> Tests: http://ci.ignite.apache.org/project.html?projectId=IgniteTes
> ts&tab=projectOverview&branch_IgniteTests=pull/1436/head
>

123