Apache Ignite Developers - Legacy Mail Archive

IgniteSemaphore and failoverSafe flag

agura

IgniteSemaphore and failoverSafe flag

Hi all!

Guys, could somebody explain the semantics of the failoverSafe flag in
IgniteSemaphore? From my point of view the test below should work, but it
fails:

public void testFailoverReleasePermits() throws Exception {
    Ignite ignite = grid(0);

    IgniteSemaphore sem = ignite.semaphore("sem", 1, true, true);

    sem.acquire(1);

    ignite.close();

    U.sleep(5000);

    ignite = grid(1);

    sem = ignite.semaphore("sem", 1, true, true);

    boolean acquire = sem.tryAcquire(1, 5000, TimeUnit.MILLISECONDS);

    assertTrue(acquire); // fails here
}

From my point of view, the permit should be available after the first Ignite
instance has left the topology.

Vladisav Jelisavcic

Re: IgniteSemaphore and failoverSafe flag

Hi,

When failoverSafe == true, the semaphore should silently redistribute the
permits acquired on the failing node.
If failoverSafe is set to false, an exception is thrown on every node
attempting to acquire.
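
The two behaviors described above can be sketched with a toy, single-JVM model. This is not Ignite code: `ToySemaphore`, `nodeLeft` and the node names are hypothetical and only illustrate the intended semantics (permits of a departed node are returned when failoverSafe is true; otherwise the semaphore becomes broken and later acquires throw).

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model of the failoverSafe semantics described in this message. Not Ignite code. */
public class FailoverSafeSketch {
    /** Hypothetical in-memory semaphore that tracks permits per node. */
    static final class ToySemaphore {
        private final boolean failoverSafe;
        private final Map<String, Integer> heldByNode = new HashMap<>();
        private int available;
        private boolean broken;

        ToySemaphore(int permits, boolean failoverSafe) {
            this.available = permits;
            this.failoverSafe = failoverSafe;
        }

        /** Returns true if a permit was taken; throws if the semaphore is broken. */
        boolean tryAcquire(String node) {
            if (broken)
                throw new IllegalStateException("permit holder left the topology");
            if (available == 0)
                return false;
            available--;
            heldByNode.merge(node, 1, Integer::sum);
            return true;
        }

        /** Models a node leaving the topology while possibly holding permits. */
        void nodeLeft(String node) {
            Integer held = heldByNode.remove(node);
            if (held == null)
                return; // The node held nothing, so nothing changes.
            if (failoverSafe)
                available += held; // Permits are silently returned.
            else
                broken = true; // Every later acquire fails with an exception.
        }
    }

    /** node-0 acquires the only permit and leaves; node-1 then tries to acquire. */
    static boolean acquireAfterOwnerLeft(boolean failoverSafe) {
        ToySemaphore sem = new ToySemaphore(1, failoverSafe);
        sem.tryAcquire("node-0");
        sem.nodeLeft("node-0");
        return sem.tryAcquire("node-1");
    }

    public static void main(String[] args) {
        System.out.println("failoverSafe=true: " + acquireAfterOwnerLeft(true));
        try {
            acquireAfterOwnerLeft(false);
        }
        catch (IllegalStateException e) {
            System.out.println("failoverSafe=false: " + e.getMessage());
        }
    }
}
```

In this model the scenario from the original test succeeds with failoverSafe == true, which matches the expectation stated in the question.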

It seems to me that when the first instance left the topology,
no backups were available (this is similar to:
https://issues.apache.org/jira/browse/IGNITE-3386).
This should be fixed (the semaphore should be re-created when create==true, as
suggested by Denis in the ticket).

It should be a minor fix; it will be ready for 1.8.

Best regards,
Vladisav

On Tue, Nov 1, 2016 at 5:41 PM, Andrey Gura <[hidden email]> wrote:

> Hi all!
>
> Guys, could somebody explain semantic of failoverSafe flag in
> IgniteSemaphore. From my point of view the test below should work but it
> fails:
>
> public void testFailoverReleasePermits() throws Exception {
> Ignite ignite = grid(0);
>
> IgniteSemaphore sem = ignite.semaphore("sem", 1, true, true);
>
> sem.acquire(1);
>
> ignite.close();
>
> U.sleep(5000);
>
> ignite = grid(1);
>
> sem = ignite.semaphore("sem", 1, true, true);
>
> boolean acquire = sem.tryAcquire(1, 5000, TimeUnit.MILLISECONDS);
>
> assertTrue(acquire); // fails here
> }
>
> From my point of view permit should be available after the first ignite
> instance left topology.
>

agura

Re: IgniteSemaphore and failoverSafe flag

Vladisav,

I've run this test with a partitioned cache with 1 backup and with a
replicated cache (4 nodes in the topology). The behavior is the same. I think
it is a bug, but first I wanted to make sure that I understand the
failoverSafe flag correctly.

Thank you for the reply. I'll create a ticket.

On Tue, Nov 1, 2016 at 8:48 PM, Vladisav Jelisavcic <[hidden email]>
wrote:

> Hi,
>
> when failoverSafe == true, semaphore should silently redistribute the
> permits acquired on the failing node.
> If failoverSafe is set to false, exception is thrown to every node
> attempting to acquire.
>
> It seems to me that when the first instance left topology,
> no backups were available (this is similar to:
> https://issues.apache.org/jira/browse/IGNITE-3386).
> This should be fixed (semaphore should be recreated when create==true, as
> suggested by Denis in the ticket).
>
> It should be a minor fix, will be ready for 1.8.
>
> Best regards,
> Vladisav

Alexey Goncharuk

Re: IgniteSemaphore and failoverSafe flag

Guys,

I was looking at this ticket and have a question related to the lock
semantics:

Suppose I have a node which has already acquired the lock, and then all
affinity nodes related to the lock leave topology. In this case, if we
automatically re-created the lock, we would end up having two lock owners
in the grid, which is unacceptable.

I think throwing an exception and forcing the user to re-create the lock
explicitly is the correct way to resolve this.

2016-11-02 14:36 GMT+03:00 Andrey Gura <[hidden email]>:

> Vladisav,
>
> I've ran this test with partitioned cache and 1 backup and with replicated
> cache (4 nodes in topology). Behavior is the same. I think it is bug. But
> the first I wanted make sure that I understand failoverSafe flag correctly.
>
> Thank you for reply. I'll create ticket.

dsetrakyan

Re: IgniteSemaphore and failoverSafe flag

On Tue, Mar 7, 2017 at 1:26 AM, Alexey Goncharuk <[hidden email]> wrote:

> Guys,
>
> I was looking at this ticket and have a question related to the lock
> semantics:
>
> Suppose I have a node which has already acquired the lock, and then all
> affinity nodes related to the lock leave topology. In this case, if we
> automatically re-created the lock, we would end up having two lock owners
> in the grid, which is unacceptable.
>
> I think throwing an exception and forcing a user to re-create the lock by
> himself is a correct way to resolve this.
>

Which user operation would result in an exception? To my knowledge, the user
may already be holding the lock and not invoking any Ignite APIs, no?

Alexey Goncharuk

Re: IgniteSemaphore and failoverSafe flag

>
> Which user operation would result in exception? To my knowledge, user may
> already be holding the lock and not invoking any Ignite APIs, no?
>

Yes, this is exactly my point.

Imagine that a node already holds a lock and another node is waiting for
the lock. If all partition nodes leave the grid and the lock is re-created,
this second node will immediately acquire the lock and we will have two
lock owners. I think in this case this second node (blocked on lock())
should get an exception saying that the lock was lost (which is, by the
way, the current behavior), and the first node should get an exception on
unlock.
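
The double-owner problem described above can be reproduced with a toy model. This is only a sketch: `ToyLockStore`, `loseAll` and the client names are hypothetical stand-ins for the cache that backs the lock and for all of its partition nodes leaving the grid. It is not how Ignite implements locks.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Toy model: silently re-creating a lost lock yields two owners. Not Ignite code. */
public class LostLockSketch {
    /** Hypothetical stand-in for the cache that stores lock ownership. */
    static final class ToyLockStore {
        private final Map<String, String> ownerByLock = new HashMap<>();

        /** Takes the lock if nobody owns it (the lock entry is created on demand). */
        boolean tryLock(String lockName, String client) {
            return ownerByLock.putIfAbsent(lockName, client) == null;
        }

        /** Models all partition nodes for the lock leaving: ownership state is lost. */
        void loseAll() {
            ownerByLock.clear();
        }
    }

    /** Returns how many clients end up inside the guarded section at once. */
    static int ownersAfterSilentRecreate() {
        ToyLockStore store = new ToyLockStore();
        Set<String> insideGuardedSection = new HashSet<>();

        // Client A acquires the lock and starts the guarded logic.
        if (store.tryLock("lock", "A"))
            insideGuardedSection.add("A");

        // Client B cannot acquire it yet and would block.
        boolean bHadToWait = !store.tryLock("lock", "B");

        // The lock state disappears while A is still running the guarded logic.
        store.loseAll();

        // With silent re-creation, the waiting client B acquires immediately.
        if (bHadToWait && store.tryLock("lock", "B"))
            insideGuardedSection.add("B");

        return insideGuardedSection.size();
    }

    public static void main(String[] args) {
        System.out.println("owners: " + ownersAfterSilentRecreate());
    }
}
```

The store has no record that A is still running, which is why the proposal here is to fail the blocked waiter with an exception instead of letting it acquire.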

dsetrakyan

Re: IgniteSemaphore and failoverSafe flag

On Tue, Mar 14, 2017 at 12:46 AM, Alexey Goncharuk <
[hidden email]> wrote:

> >
> > Which user operation would result in exception? To my knowledge, user may
> > already be holding the lock and not invoking any Ignite APIs, no?
> >
>
> Yes, this is exactly my point.
>
> Imagine that a node already holds a lock and another node is waiting for
> the lock. If all partition nodes leave the grid and the lock is re-created,
> this second node will immediately acquire the lock and we will have two
> lock owners. I think in this case this second node (blocked on lock())
> should get an exception saying that the lock was lost (which is, by the
> way, the current behavior), and the first node should get an exception on
> unlock.
>

Makes sense.

Valentin Kulichenko

Re: IgniteSemaphore and failoverSafe flag

Guys,

How does re-creating the lock help? My understanding is that the scenario is
the following:

1. Client A creates and acquires a lock, and then starts to execute guarded
logic.
2. Client B tries to acquire the same lock and parks to wait.
3. Before client A unlocks, all affinity nodes for the lock fail, and the lock
disappears from the cache.
4. Client B fails with an exception, re-creates the lock, acquires it, and
starts to execute guarded logic concurrently with client A.

In my view this is wrong anyway, regardless of whether it happens silently
or with an exception handled in the user's code, because this code has no
way to know whether client A still holds the lock.

Am I missing something?

-Val

On Tue, Mar 14, 2017 at 10:14 AM, Dmitriy Setrakyan <[hidden email]> wrote:

> On Tue, Mar 14, 2017 at 12:46 AM, Alexey Goncharuk <[hidden email]> wrote:
>
> > > Which user operation would result in exception? To my knowledge, user may
> > > already be holding the lock and not invoking any Ignite APIs, no?
> >
> > Yes, this is exactly my point.
> >
> > Imagine that a node already holds a lock and another node is waiting for
> > the lock. If all partition nodes leave the grid and the lock is re-created,
> > this second node will immediately acquire the lock and we will have two
> > lock owners. I think in this case this second node (blocked on lock())
> > should get an exception saying that the lock was lost (which is, by the
> > way, the current behavior), and the first node should get an exception on
> > unlock.
>
> Makes sense.

Vladisav Jelisavcic

Re: IgniteSemaphore and failoverSafe flag

Hi everyone,

I agree with Val, he's got a point; re-creating the lock doesn't seem
possible (at least not with the transactional cache lock/semaphore we have).
Is this re-create behavior really needed?

Best regards,
Vladisav

On Thu, Mar 16, 2017 at 8:34 PM, Valentin Kulichenko <
[hidden email]> wrote:

> Guys,
>
> How does recreation of the lock helps? My understanding is that scenario is
> the following:
>
> 1. Client A creates and acquires a lock, and then starts to execute guarded
> logic.
> 2. Client B tries to acquire the same lock and parks to wait.
> 3. Before client A unlocks, all affinity nodes for the lock fail, lock
> disappears from the cache.
> 4. Client B fails with exception, recreates the lock, acquires it, and
> starts to execute guarded logic concurrently with client A.
>
> In my view this is wrong anyway, regardless of whether this happens
> silently or with an exception handled in user's code. Because this code
> doesn't have any way to know if client A still holds the lock or not.
>
> Am I missing something?
>
> -Val

Alexey Goncharuk

Re: IgniteSemaphore and failoverSafe flag

I think re-creation should be handled by the user, who must make sure that
nobody else is currently executing the guarded logic before re-creating
the lock. This is exactly the same semantics as with
BrokenBarrierException for j.u.c.CyclicBarrier.
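
For reference, the CyclicBarrier behavior mentioned above looks like this in plain Java. It is a minimal, single-threaded sketch (the class and method names are made up for illustration): a timed-out wait breaks the barrier, later waits fail fast with BrokenBarrierException, and only an explicit reset() by the caller makes the barrier usable again.

```java
import java.util.concurrent.BrokenBarrierException;
import java.util.concurrent.CyclicBarrier;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Shows that a broken CyclicBarrier stays broken until the caller resets it. */
public class BrokenBarrierSketch {
    /** Returns {timedOut, brokenAfterTimeout, sawBrokenBarrierException, brokenAfterReset}. */
    static boolean[] demo() {
        CyclicBarrier barrier = new CyclicBarrier(2);
        boolean timedOut = false;
        boolean sawBroken = false;

        // Only one of the two parties arrives, so the timed wait expires
        // and the barrier is put into the broken state.
        try {
            barrier.await(50, TimeUnit.MILLISECONDS);
        }
        catch (TimeoutException e) {
            timedOut = true;
        }
        catch (InterruptedException | BrokenBarrierException e) {
            throw new IllegalStateException(e);
        }

        boolean brokenAfterTimeout = barrier.isBroken();

        // Any further wait fails immediately instead of blocking.
        try {
            barrier.await();
        }
        catch (BrokenBarrierException e) {
            sawBroken = true;
        }
        catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }

        // Recovery is an explicit decision made by the caller.
        barrier.reset();

        return new boolean[] {timedOut, brokenAfterTimeout, sawBroken, barrier.isBroken()};
    }

    public static void main(String[] args) {
        boolean[] r = demo();
        System.out.println(r[0] + " " + r[1] + " " + r[2] + " " + r[3]);
    }
}
```

The analogy is that the library reports the broken state and the application decides when it is safe to start over.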

2017-03-17 2:39 GMT+03:00 Vladisav Jelisavcic <[hidden email]>:

> Hi everyone,
>
> I agree with Val, he's got a point; recreating the lock doesn't seem
> possible
> (at least not the with the transactional cache lock/semaphore we have).
> Is this re-create behavior really needed?
>
> Best regards,
> Vladisav

dkarachentsev

Re: IgniteSemaphore and failoverSafe flag

Hi Vladisav,

I see you've been working on [1] for a while. Have you had a chance to fix
it? If not, is there any estimate?

[1] https://issues.apache.org/jira/browse/IGNITE-1977

Thanks!

-Dmitry.

20.03.2017 10:28, Alexey Goncharuk wrote:

> I think re-creation should be handled by a user who will make sure that
> nobody else is currently executing the guarded logic before the
> re-creation. This is exactly the same semantics as with
> BrokenBarrierException for j.u.c.CyclicBarrier.

Vladisav Jelisavcic

Re: IgniteSemaphore and failoverSafe flag

Hey Dmitry,

Sorry for the late reply, I'll try to prepare a PR later today.

Best regards,
Vladisav

On Tue, Apr 4, 2017 at 11:05 AM, Dmitry Karachentsev <
[hidden email]> wrote:

> Hi Vladislav,
>
> I see you're developing [1] for a while, did you have any chance to fix
> it? If no, is there any estimate?
>
> [1] https://issues.apache.org/jira/browse/IGNITE-1977
>
> Thanks!
>
> -Dmitry.

dkarachentsev

Re: IgniteSemaphore and failoverSafe flag

Hi Vladisav,

Thanks for your contribution! But it seems that it doesn't fix the related
tickets, in particular [1].
Could you please take a look?

[1] https://issues.apache.org/jira/browse/IGNITE-4173

Thanks!

06.04.2017 16:27, Vladisav Jelisavcic wrote:

> Hey Dmitry,
>
> sorry for the late reply, I'll try to bake a pr later during the day.
>
> Best regards,
> Vladisav

Vladisav Jelisavcic

Re: IgniteSemaphore and failoverSafe flag

Hi Dmitry,

Sure, I made a fix; take a look at the PR and the comments in the ticket.

Best regards,
Vladisav

On Tue, Apr 11, 2017 at 3:00 PM, Dmitry Karachentsev <
[hidden email]> wrote:

> Hi Vladislav,
>
> Thanks for your contribution! But it seems doesn't fix related tickets, in
> particular [1].
> Could you please take a look?
>
> [1] https://issues.apache.org/jira/browse/IGNITE-4173
>
> Thanks!

dkarachentsev

Re: IgniteSemaphore and failoverSafe flag

Thanks a lot!

12.04.2017 16:35, Vladisav Jelisavcic wrote:

> Hi Dmitry,
>
> sure, I made a fix, take a look at the PR and the comments in the ticket.
>
> Best regards,
> Vladisav

dkarachentsev

Re: IgniteSemaphore and failoverSafe flag

Hi Vladisav,

It looks like these tests [1] started failing after the fix was merged.
Could you please take a look?

[1]
http://ci.ignite.apache.org/viewLog.html?buildId=544238&tab=buildResultsDiv&buildTypeId=IgniteTests_IgniteBinaryObjectsDataStrucutures

Thanks!

-Dmitry.

13.04.2017 16:15, Dmitry Karachentsev wrote:

> Thanks a lot!
>
> 12.04.2017 16:35, Vladisav Jelisavcic wrote:
>> Hi Dmitry,
>>
>> sure, I made a fix, take a look at the PR and the comments in the ticket.
>>
>> Best regards,
>> Vladisav
>>
>> On Tue, Apr 11, 2017 at 3:00 PM, Dmitry Karachentsev
>> <[hidden email]> wrote:
>>
>> Hi Vladislav,
>>
>> Thanks for your contribution! But it seems doesn't fix related
>> tickets, in particular [1].
>> Could you please take a look?
>>
>> [1] https://issues.apache.org/jira/browse/IGNITE-4173
>>
>> Thanks!
>>
>> 06.04.2017 16:27, Vladisav Jelisavcic wrote:
>>> Hey Dmitry,
>>>
>>> sorry for the late reply, I'll try to bake a pr later during the
>>> day.
>>>
>>> Best regards,
>>> Vladisav
>>>
>>>
>>>
>>> On Tue, Apr 4, 2017 at 11:05 AM, Dmitry Karachentsev
>>> <[hidden email]> wrote:
>>>
>>> Hi Vladislav,
>>>
>>> I see you're developing [1] for a while, did you have any
>>> chance to fix it? If no, is there any estimate?
>>>
>>> [1] https://issues.apache.org/jira/browse/IGNITE-1977
>>>
>>> Thanks!
>>>
>>> -Dmitry.
>>>
>>>
>>>
>>> 20.03.2017 10:28, Alexey Goncharuk wrote:
>>>
>>> I think re-creation should be handled by a user who will make sure that
>>> nobody else is currently executing the guarded logic before the
>>> re-creation. This is exactly the same semantics as with
>>> BrokenBarrierException for j.u.c.CyclicBarrier.
>>>
>>> 2017-03-17 2:39 GMT+03:00 Vladisav Jelisavcic <[hidden email]>:
>>>
>>> Hi everyone,
>>>
>>> I agree with Val, he's got a point; recreating the lock doesn't seem
>>> possible (at least not with the transactional cache lock/semaphore we
>>> have).
>>> Is this re-create behavior really needed?
>>>
>>> Best regards,
>>> Vladisav
>>>
>>> On Thu, Mar 16, 2017 at 8:34 PM, Valentin Kulichenko <[hidden email]> wrote:
>>>
>>> Guys,
>>>
>>> How does recreation of the lock help? My understanding is that the
>>> scenario is the following:
>>>
>>> 1. Client A creates and acquires a lock, and then starts to execute
>>> guarded logic.
>>> 2. Client B tries to acquire the same lock and parks to wait.
>>> 3. Before client A unlocks, all affinity nodes for the lock fail, lock
>>> disappears from the cache.
>>> 4. Client B fails with exception, recreates the lock, acquires it, and
>>> starts to execute guarded logic concurrently with client A.
>>>
>>> In my view this is wrong anyway, regardless of whether this happens
>>> silently or with an exception handled in user's code. Because this code
>>> doesn't have any way to know if client A still holds the lock or not.
>>>
>>> Am I missing something?
>>>
>>> -Val
>>>
>>> On Tue, Mar 14, 2017 at 10:14 AM, Dmitriy Setrakyan <[hidden email]> wrote:
>>>
>>> On Tue, Mar 14, 2017 at 12:46 AM, Alexey Goncharuk <[hidden email]> wrote:
>>>
>>> Which user operation would result in exception? To my knowledge, user may
>>> already be holding the lock and not invoking any Ignite APIs, no?
>>>
>>> Yes, this is exactly my point.
>>>
>>> Imagine that a node already holds a lock and another node is waiting for
>>> the lock. If all partition nodes leave the grid and the lock is re-created,
>>> this second node will immediately acquire the lock and we will have two
>>> lock owners. I think in this case this second node (blocked on lock())
>>> should get an exception saying that the lock was lost (which is, by the
>>> way, the current behavior), and the first node should get an exception on
>>> unlock.
>>>
>>> Makes sense.
>>>
>>>
>>>
>>
>>
>
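
The comparison with BrokenBarrierException in the quoted discussion above can be tried with plain java.util.concurrent. The sketch below is only an illustration of that analogy and does not touch any Ignite code: the class name `BrokenBarrierDemo` and the 10 ms timeout are arbitrary choices. A barrier that breaks stays broken, later callers are told so with an exception, and only an explicit reset() by the user makes it usable again.

```java
import java.util.concurrent.BrokenBarrierException;
import java.util.concurrent.CyclicBarrier;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Shows that a broken CyclicBarrier fails loudly until the user resets it. */
final class BrokenBarrierDemo {
    static String run() {
        CyclicBarrier barrier = new CyclicBarrier(2); // two parties expected
        StringBuilder log = new StringBuilder();

        try {
            // Only one party ever arrives, so the timed wait expires
            // and the barrier is put into the broken state.
            barrier.await(10, TimeUnit.MILLISECONDS);
            log.append("passed;");
        } catch (TimeoutException e) {
            log.append("timeout;");
        } catch (BrokenBarrierException | InterruptedException e) {
            log.append("unexpected;");
        }
        log.append("broken=").append(barrier.isBroken()).append(';');

        try {
            // A later arrival does not wait silently: it is told the barrier is broken.
            barrier.await();
            log.append("passed;");
        } catch (BrokenBarrierException e) {
            log.append("BrokenBarrierException;");
        } catch (InterruptedException e) {
            log.append("unexpected;");
        }

        // Re-creation is an explicit decision made by the caller.
        barrier.reset();
        log.append("afterReset=").append(barrier.isBroken());
        return log.toString();
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

The point of the analogy is that the library reports the broken state and leaves it to the user to decide when re-creation is safe.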

Vladisav Jelisavcic

Re: IgniteSemaphore and failoverSafe flag

Hi Dmitry,

It looks to me that this test is not valid: after semaphore 2 fails, the
permits are redistributed, so the expected number of permits should really
be 20, not 10. Do you agree?

I guess that before the latest fix this test was (incorrectly) passing because
permits weren't released properly.

What do you think?
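
The permit arithmetic behind this can be sketched with a toy, single-JVM model. Everything in it is an assumption made for illustration: the class `FailoverSafePermits`, the node name "node-2", and the split of 20 total permits with 10 held by the failing node are not taken from the test in question, and this is not how Ignite implements the semaphore.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model of failover-safe permit accounting; hypothetical, not Ignite's implementation. */
final class FailoverSafePermits {
    private int available;
    private final Map<String, Integer> heldByNode = new HashMap<>();

    FailoverSafePermits(int permits) {
        available = permits;
    }

    /** Takes permits on behalf of a node; returns false if not enough are free. */
    synchronized boolean tryAcquire(String nodeId, int permits) {
        if (permits > available)
            return false;
        available -= permits;
        heldByNode.merge(nodeId, permits, Integer::sum);
        return true;
    }

    /** Models a node leaving the topology: whatever it held goes back to the pool. */
    synchronized void nodeFailed(String nodeId) {
        Integer held = heldByNode.remove(nodeId);
        if (held != null)
            available += held;
    }

    synchronized int availablePermits() {
        return available;
    }

    public static void main(String[] args) {
        FailoverSafePermits sem = new FailoverSafePermits(20);
        sem.tryAcquire("node-2", 10);
        System.out.println(sem.availablePermits()); // 10 while node-2 holds its permits
        sem.nodeFailed("node-2");
        System.out.println(sem.availablePermits()); // 20 once they are returned
    }
}
```

Under these assumptions, a test that still expects 10 available permits after the failure would only pass if the failed node's permits were never returned.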

On Fri, Apr 14, 2017 at 11:27 AM, Dmitry Karachentsev <
[hidden email]> wrote:

> Hi Vladislav,
>
> It looks like these tests [1] started failing after the fix was merged. Could
> you please take a look?
>
> [1] http://ci.ignite.apache.org/viewLog.html?buildId=544238&tab=buildResultsDiv&buildTypeId=IgniteTests_IgniteBinaryObjectsDataStrucutures
>
> Thanks!
>
> -Dmitry.

dkarachentsev

Re: IgniteSemaphore and failoverSafe flag

Vladislav,

Yep, you're right. I'll fix it.

Thanks!

14.04.2017 15:18, Vladisav Jelisavcic wrote:

> Hi Dmitry,
>
> It looks to me that this test is not valid: after semaphore 2 fails, the
> permits are redistributed, so the expected number of permits should really
> be 20, not 10. Do you agree?
>
> I guess that before the latest fix this test was (incorrectly) passing because
> permits weren't released properly.
>
> What do you think?

dkarachentsev

Re: IgniteSemaphore and failoverSafe flag

Vladislav,

One more thing: this test [1] started failing on semaphore close when
this fix [2] was introduced.
Could you please check it?

[1]
http://ci.ignite.apache.org/viewLog.html?buildId=547151&tab=buildResultsDiv&buildTypeId=IgniteTests_IgniteDataStrucutures#testNameId-979977708202725050
[2] https://issues.apache.org/jira/browse/IGNITE-1977

Thanks!

14.04.2017 15:27, Dmitry Karachentsev wrote:

> Vladislav,
>
> Yep, you're right. I'll fix it.
>
> Thanks!

Vladisav Jelisavcic

Re: IgniteSemaphore and failoverSafe flag

Hmm, I cannot reproduce this behavior locally.
My guess is that the interrupt flag is not always cleared properly in the
#GridCacheSemaphore.acquire method (but that doesn't have anything to do
with the latest fix).

Can you make it reproducible?
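
For context on why a stale interrupt flag would matter, here is a minimal sketch using the JDK's java.util.concurrent.Semaphore as a stand-in. It does not reproduce the failing test and says nothing about GridCacheSemaphore internals. It only shows the general effect being guessed at: if a thread's interrupt flag is still set from earlier work, the next interruptible acquire() fails even though a permit is free. The class and method names are made up for the example.

```java
import java.util.concurrent.Semaphore;

/** Shows how a leftover interrupt flag affects an interruptible acquire. */
final class StaleInterruptDemo {
    /**
     * Sets the current thread's interrupt flag, optionally clears it again,
     * then tries to acquire a free permit.
     *
     * @return true if acquire() threw InterruptedException
     */
    static boolean acquireIsInterrupted(boolean clearFlagFirst) {
        Semaphore sem = new Semaphore(1);   // one permit, nobody else competing
        Thread.currentThread().interrupt(); // simulate a flag left over from earlier work
        if (clearFlagFirst)
            Thread.interrupted();           // reads and clears the flag
        try {
            sem.acquire();
            return false;
        } catch (InterruptedException e) {
            // acquire() clears the flag when it throws, so nothing leaks out of here
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(acquireIsInterrupted(false)); // true: stale flag aborts the acquire
        System.out.println(acquireIsInterrupted(true));  // false: flag was cleared first
    }
}
```

A failure of this kind depends on what ran on the thread beforehand, which would be consistent with it showing up on CI but not locally.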

On Fri, Apr 14, 2017 at 2:46 PM, Dmitry Karachentsev <
[hidden email]> wrote:

> Vladislav,
>
> One more thing: this test [1] started failing on semaphore close when this
> fix [2] was introduced.
> Could you please check it?
>
> [1] http://ci.ignite.apache.org/viewLog.html?buildId=547151&tab=buildResultsDiv&buildTypeId=IgniteTests_IgniteDataStrucutures#testNameId-979977708202725050
> [2] https://issues.apache.org/jira/browse/IGNITE-1977
>
> Thanks!