Hi all!
Guys, could somebody explain the semantics of the failoverSafe flag in IgniteSemaphore? From my point of view the test below should pass, but it fails:

    public void testFailoverReleasePermits() throws Exception {
        Ignite ignite = grid(0);

        // Arguments: name, permit count, failoverSafe, create.
        IgniteSemaphore sem = ignite.semaphore("sem", 1, true, true);

        sem.acquire(1);

        // Stop the node that holds the only permit.
        ignite.close();

        U.sleep(5000);

        ignite = grid(1);

        sem = ignite.semaphore("sem", 1, true, true);

        boolean acquire = sem.tryAcquire(1, 5000, TimeUnit.MILLISECONDS);

        assertTrue(acquire); // fails here
    }

From my point of view the permit should be available after the first Ignite instance has left the topology.
Hi,
When failoverSafe == true, the semaphore should silently redistribute the permits acquired on the failing node. If failoverSafe is set to false, an exception is thrown to every node attempting to acquire.

It seems to me that when the first instance left the topology, no backups were available (this is similar to https://issues.apache.org/jira/browse/IGNITE-3386). This should be fixed (the semaphore should be recreated when create == true, as suggested by Denis in the ticket).

It should be a minor fix and will be ready for 1.8.

Best regards,
Vladisav

On Tue, Nov 1, 2016 at 5:41 PM, Andrey Gura <[hidden email]> wrote:
> From my point of view permit should be available after the first ignite
> instance left topology.
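To make the contract described above concrete, here is a minimal single-JVM sketch. It is a toy model, not Ignite's implementation: ToySemaphore, its node ids, and the use of IllegalStateException for the "broken" case are all assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy, single-JVM model of the failoverSafe contract; not Ignite's implementation. */
class ToySemaphore {
    private final boolean failoverSafe;
    private int available;
    private boolean broken;

    /** Permits currently held, keyed by a hypothetical node id. */
    private final Map<String, Integer> heldByNode = new HashMap<>();

    ToySemaphore(int permits, boolean failoverSafe) {
        this.available = permits;
        this.failoverSafe = failoverSafe;
    }

    /** Non-blocking acquire: returns false if not enough permits are free. */
    boolean tryAcquire(String nodeId, int permits) {
        if (broken)
            throw new IllegalStateException("Semaphore is broken: a permit holder left the topology");
        if (available < permits)
            return false;
        available -= permits;
        heldByNode.merge(nodeId, permits, Integer::sum);
        return true;
    }

    /** Called when a node leaves the topology. */
    void onNodeLeft(String nodeId) {
        Integer held = heldByNode.remove(nodeId);
        if (held == null)
            return; // The node held nothing, so nothing changes.
        if (failoverSafe)
            available += held; // Silently hand the permits back.
        else
            broken = true; // Every later acquire attempt fails.
    }
}

class FailoverSafeDemo {
    /** Replays the test from the first message against the toy model. */
    static boolean permitAvailableAfterHolderLeaves(boolean failoverSafe) {
        ToySemaphore sem = new ToySemaphore(1, failoverSafe);
        sem.tryAcquire("node-0", 1);
        sem.onNodeLeft("node-0");
        return sem.tryAcquire("node-1", 1);
    }

    public static void main(String[] args) {
        System.out.println(permitAvailableAfterHolderLeaves(true));
    }
}
```

Under this reading of the flag, the test from the first message is expected to pass with failoverSafe == true, which is why the reported failure looks like a bug rather than intended behavior.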
Vladisav,
I've run this test with a partitioned cache with 1 backup and with a replicated cache (4 nodes in the topology). The behavior is the same. I think it is a bug, but first I wanted to make sure that I understand the failoverSafe flag correctly.

Thank you for the reply. I'll create a ticket.

On Tue, Nov 1, 2016 at 8:48 PM, Vladisav Jelisavcic <[hidden email]> wrote:
> It seems to me that when the first instance left topology,
> no backups were available (this is similar to:
> https://issues.apache.org/jira/browse/IGNITE-3386).
Guys,
I was looking at this ticket and have a question related to the lock semantics.

Suppose I have a node which has already acquired the lock, and then all affinity nodes related to the lock leave the topology. In this case, if we automatically re-created the lock, we would end up having two lock owners in the grid, which is unacceptable.

I think throwing an exception and forcing the user to re-create the lock himself is the correct way to resolve this.

2016-11-02 14:36 GMT+03:00 Andrey Gura <[hidden email]>:
> Thank you for reply. I'll create ticket.
On Tue, Mar 7, 2017 at 1:26 AM, Alexey Goncharuk <[hidden email]> wrote:
> Suppose I have a node which has already acquired the lock, and then all
> affinity nodes related to the lock leave topology. In this case, if we
> automatically re-created the lock, we would end up having two lock owners
> in the grid, which is unacceptable.
>
> I think throwing an exception and forcing a user to re-create the lock by
> himself is a correct way to resolve this.

Which user operation would result in an exception? To my knowledge, the user may already be holding the lock and not invoking any Ignite APIs, no?
> Which user operation would result in exception? To my knowledge, user may
> already be holding the lock and not invoking any Ignite APIs, no?

Yes, this is exactly my point.

Imagine that a node already holds a lock and another node is waiting for the lock. If all partition nodes leave the grid and the lock is re-created, this second node will immediately acquire the lock and we will have two lock owners. I think in this case this second node (blocked on lock()) should get an exception saying that the lock was lost (which is, by the way, the current behavior), and the first node should get an exception on unlock.
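The behavior proposed above can be sketched in plain Java. This is only a single-JVM illustration under stated assumptions: LostAwareLock, LockLostException, and markLost() are hypothetical names, and markLost() stands in for "all partition nodes left the grid". It is not how the Ignite lock is implemented.

```java
import java.util.concurrent.atomic.AtomicReference;

/** Toy single-JVM lock; "lost" stands for all partition nodes leaving the grid. */
class LostAwareLock {
    /** Hypothetical exception type used only in this sketch. */
    static class LockLostException extends RuntimeException {
        LockLostException(String msg) {
            super(msg);
        }
    }

    private String owner; // Guarded by "this".
    private boolean lost; // Guarded by "this".

    /** Blocks until the lock is free, or fails if the lock state is lost meanwhile. */
    synchronized void lock(String client) throws InterruptedException {
        while (owner != null && !lost)
            wait();
        if (lost)
            throw new LockLostException("Lock was lost while " + client + " tried to acquire it");
        owner = client;
    }

    /** Releases the lock; a holder learns about the loss here. */
    synchronized void unlock(String client) {
        if (lost)
            throw new LockLostException("Lock was lost while " + client + " held it");
        if (!client.equals(owner))
            throw new IllegalMonitorStateException(client + " does not hold the lock");
        owner = null;
        notifyAll();
    }

    /** Simulates losing the lock state and wakes all waiters so that they fail. */
    synchronized void markLost() {
        lost = true;
        notifyAll();
    }
}

class LostLockDemo {
    /** Returns the error seen by the waiter (index 0) and by the holder (index 1). */
    static String[] run() {
        LostAwareLock lock = new LostAwareLock();
        AtomicReference<String> waiterError = new AtomicReference<>();
        String holderError = null;
        try {
            lock.lock("A"); // The lock is free, so A acquires it immediately.
            Thread waiter = new Thread(() -> {
                try {
                    lock.lock("B"); // B waits behind A, or fails at once if the loss came first.
                } catch (LostAwareLock.LockLostException e) {
                    waiterError.set(e.getMessage());
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            waiter.start();
            lock.markLost();
            waiter.join();
            try {
                lock.unlock("A");
            } catch (LostAwareLock.LockLostException e) {
                holderError = e.getMessage();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return new String[] {waiterError.get(), holderError};
    }

    public static void main(String[] args) {
        for (String msg : run())
            System.out.println(msg);
    }
}
```

In this sketch the waiter never becomes an owner, and the holder only finds out on unlock(), which matches the point that a holder may not call any API while it executes the guarded logic.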
On Tue, Mar 14, 2017 at 12:46 AM, Alexey Goncharuk <[hidden email]> wrote:
> I think in this case this second node (blocked on lock())
> should get an exception saying that the lock was lost (which is, by the
> way, the current behavior), and the first node should get an exception on
> unlock.

Makes sense.
Guys,
How does re-creating the lock help? My understanding is that the scenario is the following:

1. Client A creates and acquires a lock, and then starts to execute the guarded logic.
2. Client B tries to acquire the same lock and parks to wait.
3. Before client A unlocks, all affinity nodes for the lock fail, and the lock disappears from the cache.
4. Client B fails with an exception, re-creates the lock, acquires it, and starts to execute the guarded logic concurrently with client A.

In my view this is wrong anyway, regardless of whether it happens silently or with an exception handled in the user's code, because this code has no way to know whether client A still holds the lock.

Am I missing something?

-Val

On Tue, Mar 14, 2017 at 10:14 AM, Dmitriy Setrakyan <[hidden email]> wrote:
> Makes sense.
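The four steps above can be replayed with a small deterministic sketch. ToyLockStore is a hypothetical stand-in for the cache that keeps the lock state (a plain in-memory map here), so this only illustrates the reasoning and says nothing about Ignite internals.

```java
import java.util.HashMap;
import java.util.Map;

class RecreateRaceDemo {
    /** Stand-in for the cache that stores lock state: lock name -> current owner. */
    static final class ToyLockStore {
        private final Map<String, String> owners = new HashMap<>();

        /** Creates the lock entry if it is missing and takes it when nobody owns it. */
        boolean createAndTryLock(String name, String client) {
            return owners.putIfAbsent(name, client) == null;
        }

        /** Simulates all affinity nodes failing: the lock state is gone. */
        void loseAllState() {
            owners.clear();
        }
    }

    /** Replays steps 1-4 and returns how many clients believe they hold the lock. */
    static int ownersAfterRecreate() {
        ToyLockStore store = new ToyLockStore();

        boolean aHolds = store.createAndTryLock("lock", "A"); // Step 1: A acquires.
        boolean bHolds = store.createAndTryLock("lock", "B"); // Step 2: B fails and would park.

        store.loseAllState(); // Step 3: the lock disappears from the cache.

        if (!bHolds)
            bHolds = store.createAndTryLock("lock", "B"); // Step 4: B re-creates and acquires.

        // Nothing told A about the loss, so A still believes it holds the lock.
        return (aHolds ? 1 : 0) + (bHolds ? 1 : 0);
    }

    public static void main(String[] args) {
        System.out.println("Clients inside the guarded section: " + ownersAfterRecreate());
    }
}
```

The store has no record of A after step 3, so nothing in step 4 can detect that A is still inside the guarded section.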
Hi everyone,
I agree with Val, he's got a point; re-creating the lock doesn't seem possible (at least not with the transactional cache lock/semaphore we have). Is this re-create behavior really needed?

Best regards,
Vladisav

On Thu, Mar 16, 2017 at 8:34 PM, Valentin Kulichenko <[hidden email]> wrote:
> In my view this is wrong anyway, regardless of whether this happens
> silently or with an exception handled in user's code. Because this code
> doesn't have any way to know if client A still holds the lock or not.
I think re-creation should be handled by the user, who will make sure that nobody else is currently executing the guarded logic before the re-creation. These are exactly the same semantics as BrokenBarrierException for j.u.c.CyclicBarrier.

2017-03-17 2:39 GMT+03:00 Vladisav Jelisavcic <[hidden email]>:
> Is this re-create behavior really needed?
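For readers less familiar with the CyclicBarrier analogy, the sketch below uses only standard java.util.concurrent calls. A two-party barrier is broken by a timeout, later callers get BrokenBarrierException, and the barrier becomes usable again only after an explicit reset(), which is the step left to the user. The class name BrokenBarrierDemo and the 10 ms timeout are arbitrary choices for illustration.

```java
import java.util.concurrent.BrokenBarrierException;
import java.util.concurrent.CyclicBarrier;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

class BrokenBarrierDemo {
    /**
     * Returns three flags: the barrier is broken after the timeout, a later await()
     * fails with BrokenBarrierException, and the barrier is still broken after reset().
     */
    static boolean[] run() {
        CyclicBarrier barrier = new CyclicBarrier(2);
        boolean brokenAfterTimeout = false;
        boolean laterAwaitFailed = false;

        try {
            // Only one of the two parties arrives, so this await times out.
            barrier.await(10, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            brokenAfterTimeout = barrier.isBroken();
        } catch (InterruptedException | BrokenBarrierException e) {
            throw new IllegalStateException(e);
        }

        try {
            // The barrier does not repair itself: every later caller is told it is broken.
            barrier.await();
        } catch (BrokenBarrierException e) {
            laterAwaitFailed = true;
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }

        // Recovery is an explicit decision made by the user of the barrier.
        barrier.reset();

        return new boolean[] {brokenAfterTimeout, laterAwaitFailed, barrier.isBroken()};
    }

    public static void main(String[] args) {
        boolean[] r = run();
        System.out.println(r[0] + " " + r[1] + " " + r[2]);
    }
}
```

The analogous flow for the lock would be: every participant observes the failure, the application makes sure nobody is still executing the guarded logic, and only then is the lock re-created.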
Hi Vladisav,
I see you've been working on [1] for a while. Have you had a chance to fix it? If not, is there an estimate?

[1] https://issues.apache.org/jira/browse/IGNITE-1977

Thanks!

-Dmitry.

20.03.2017 10:28, Alexey Goncharuk wrote:
> I think re-creation should be handled by a user who will make sure that
> nobody else is currently executing the guarded logic before the
> re-creation.
Hey Dmitry,
Sorry for the late reply. I'll try to prepare a PR later today.

Best regards,
Vladisav

On Tue, Apr 4, 2017 at 11:05 AM, Dmitry Karachentsev <[hidden email]> wrote:
> I see you're developing [1] for a while, did you have any chance to fix
> it? If no, is there any estimate?
>
> [1] https://issues.apache.org/jira/browse/IGNITE-1977
Hi Vladisav,
Thanks for your contribution! But it doesn't seem to fix the related tickets, in particular [1]. Could you please take a look?

[1] https://issues.apache.org/jira/browse/IGNITE-4173

Thanks!

06.04.2017 16:27, Vladisav Jelisavcic wrote:
> sorry for the late reply, I'll try to bake a pr later during the day.
Hi Dmitry,
Sure, I made a fix. Take a look at the PR and the comments in the ticket.

Best regards,
Vladisav

On Tue, Apr 11, 2017 at 3:00 PM, Dmitry Karachentsev <[hidden email]> wrote:
> Thanks for your contribution! But it seems doesn't fix related tickets, in
> particular [1].
> Could you please take a look?
>
> [1] https://issues.apache.org/jira/browse/IGNITE-4173
Thanks a lot!
12.04.2017 16:35, Vladisav Jelisavcic wrote:
> sure, I made a fix, take a look at the PR and the comments in the ticket.
Hi Vladisav,
It looks like these tests [1] started failing after the fix was merged. Could you please take a look?

[1] http://ci.ignite.apache.org/viewLog.html?buildId=544238&tab=buildResultsDiv&buildTypeId=IgniteTests_IgniteBinaryObjectsDataStrucutures

Thanks!
-Dmitry.

13.04.2017 16:15, Dmitry Karachentsev wrote:
> Thanks a lot! |
Hi Dmitry,
It looks to me like this test is not valid: after semaphore 2 fails, its permits are redistributed, so the expected number of permits should really be 20, not 10. Do you agree?

I guess that before the latest fix this test was (incorrectly) passing because the permits weren't released properly.

What do you think?

On Fri, Apr 14, 2017 at 11:27 AM, Dmitry Karachentsev <[hidden email]> wrote:
> Hi Vladisav,
>
> It looks like these tests [1] started failing after the fix was merged. Could you please take a look?
>
> [1] http://ci.ignite.apache.org/viewLog.html?buildId=544238&tab=buildResultsDiv&buildTypeId=IgniteTests_IgniteBinaryObjectsDataStrucutures
>
> Thanks!
>
> -Dmitry. |
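The permit arithmetic behind the "20, not 10" argument can be sketched with a toy, single-JVM model. This is not Ignite code: the class `FailoverPermitModel`, its methods, and the node names are hypothetical, and the numbers (20 permits in total, 10 held by the failed node) are assumptions chosen only to mirror the discussion, since the actual test setup is not shown in this thread. The failoverSafe behavior follows the description given earlier in the thread: with failoverSafe == true the permits of a failed node are silently returned, otherwise later acquirers get an exception.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model of failover-safe permit accounting. Hypothetical, not the Ignite implementation. */
class FailoverPermitModel {
    private int available;
    private final boolean failoverSafe;
    private boolean broken;
    /** Permits currently held, keyed by a (hypothetical) node id. */
    private final Map<String, Integer> heldByNode = new HashMap<>();

    FailoverPermitModel(int permits, boolean failoverSafe) {
        this.available = permits;
        this.failoverSafe = failoverSafe;
    }

    /** Takes the permits for the node if enough are free; throws if the semaphore is broken. */
    boolean tryAcquire(String nodeId, int permits) {
        if (broken)
            throw new IllegalStateException("Semaphore is broken: an owner node failed");
        if (permits > available)
            return false;
        available -= permits;
        heldByNode.merge(nodeId, permits, Integer::sum);
        return true;
    }

    /** Called when a node leaves the topology without releasing its permits. */
    void onNodeFailed(String nodeId) {
        Integer held = heldByNode.remove(nodeId);
        if (held == null)
            return; // the node held nothing, so there is nothing to recover
        if (failoverSafe)
            available += held; // give the failed node's permits back
        else
            broken = true;     // non-failover-safe mode: later acquires fail
    }

    int availablePermits() {
        return available;
    }

    public static void main(String[] args) {
        FailoverPermitModel sem = new FailoverPermitModel(20, true);
        sem.tryAcquire("node-2", 10);
        System.out.println("Before failure: " + sem.availablePermits());
        sem.onNodeFailed("node-2");
        System.out.println("After failure:  " + sem.availablePermits());
    }
}
```

Under these assumptions, a test that still expects 10 available permits after the failure only passes if the failed node's permits are never returned, which matches the guess that the old assertion was passing for the wrong reason.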
Vladisav,
Yep, you're right. I'll fix it.

Thanks!

14.04.2017 15:18, Vladisav Jelisavcic wrote:
> Hi Dmitry,
>
> It looks to me like this test is not valid: after semaphore 2 fails, its permits are redistributed, so the expected number of permits should really be 20, not 10. Do you agree?
>
> I guess that before the latest fix this test was (incorrectly) passing because the permits weren't released properly.
>
> What do you think? |
Vladisav,
One more thing: this test [1] started failing on semaphore close when this fix [2] was introduced. Could you check it, please?

[1] http://ci.ignite.apache.org/viewLog.html?buildId=547151&tab=buildResultsDiv&buildTypeId=IgniteTests_IgniteDataStrucutures#testNameId-979977708202725050
[2] https://issues.apache.org/jira/browse/IGNITE-1977

Thanks!

14.04.2017 15:27, Dmitry Karachentsev wrote:
> Vladisav,
>
> Yep, you're right. I'll fix it.
>
> Thanks! |
Hmm, I cannot reproduce this behavior locally.
My guess is that the interrupt flag is not always cleared properly in the GridCacheSemaphore.acquire method (but that has nothing to do with the latest fix).

Can you make it reproducible?

On Fri, Apr 14, 2017 at 2:46 PM, Dmitry Karachentsev <[hidden email]> wrote:
> Vladisav,
>
> One more thing: this test [1] started failing on semaphore close when this fix [2] was introduced. Could you check it, please?
>
> [1] http://ci.ignite.apache.org/viewLog.html?buildId=547151&tab=buildResultsDiv&buildTypeId=IgniteTests_IgniteDataStrucutures#testNameId-979977708202725050
> [2] https://issues.apache.org/jira/browse/IGNITE-1977
>
> Thanks! |
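To show how a stale interrupt flag could surface in a later, unrelated call, here is a minimal sketch using the plain JDK `java.util.concurrent.Semaphore`. It does not touch `GridCacheSemaphore`, and whether that class actually leaks the flag is only the guess stated above. The class and method names (`StaleInterruptDemo`, `acquireFailsWithStaleFlag`, `flagClearedAfterCheck`) are made up for illustration.

```java
import java.util.concurrent.Semaphore;

/** JDK-only illustration of a leftover interrupt flag. Hypothetical names, not Ignite code. */
class StaleInterruptDemo {
    /**
     * Sets the interrupt flag before calling acquire() on a semaphore that has a free permit.
     * Returns true if acquire() threw InterruptedException anyway.
     */
    static boolean acquireFailsWithStaleFlag() {
        Semaphore sem = new Semaphore(1);
        Thread.currentThread().interrupt(); // simulate a flag left over from earlier code
        try {
            sem.acquire(); // checks the interrupt status on entry
            sem.release();
            return false;
        } catch (InterruptedException e) {
            return true; // throwing InterruptedException also clears the flag
        }
    }

    /** Thread.interrupted() reads and clears the flag; isInterrupted() only reads it. */
    static boolean flagClearedAfterCheck() {
        Thread.currentThread().interrupt();
        boolean wasSet = Thread.interrupted();
        return wasSet && !Thread.currentThread().isInterrupted();
    }

    public static void main(String[] args) {
        System.out.println("acquire fails with stale flag: " + acquireFailsWithStaleFlag());
        System.out.println("flag cleared after check:      " + flagClearedAfterCheck());
    }
}
```

If something similar happens in the test, the failure would depend on thread timing, which might explain why it shows up on CI but not locally.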