GridSpinBusyLock performance

classic Classic list List threaded Threaded
8 messages Options
Reply | Threaded
Open this post in threaded view
|

GridSpinBusyLock performance

Vladimir Ozerov
Igniters,

We have two pretty strange constructs: GridSpinReadWriteLock and base on it
GridSpinBusyLock.
As I understand it was an effort to create more performant RWLock than
ReentrantReadWriteLock
for cases when wrtie locks are very unlikely.

As busy lock concept is also used in some sensitive places of "platforms"
module, I measured performance of read lock-unlock cycles for both
ReentrantReadWriteLock
and GridSpinReadWriteLock using JMH.

Our implementation doesn't offer any perform benefits comparing to
ReentrantReadWriteLock, their performance are almost equal. This makes
sense, because essentailly both locks just CASes on a shared variable to
obtain the read lock. Looks like we can safely remove these "spinners" and
use ReentrantReadWriteLock instead.

More serious perfomance gain can be achieved if we stripe the lock (e.g.
like it is done in LongAdder) thus decreasing contention on shared
variables. Quick experiments shown 5x throughput increase for read
lock-unlock cycles when lock is striped.

Thoughts?

Vladimir.
Reply | Threaded
Open this post in threaded view
|

Re: GridSpinBusyLock performance

dsetrakyan
I don't recall exactly, but from what I remember, there were other benefits
to the spin-lock approach. Don't we use some characteristics of this lock
to properly shut down the system?

D.

On Mon, Aug 31, 2015 at 5:24 AM, Vladimir Ozerov <[hidden email]>
wrote:

> Igniters,
>
> We have two pretty strange constructs: GridSpinReadWriteLock and base on it
> GridSpinBusyLock.
> As I understand it was an effort to create more performant RWLock than
> ReentrantReadWriteLock
> for cases when wrtie locks are very unlikely.
>
> As busy lock concept is also used in some sensitive places of "platforms"
> module, I measured performance of read lock-unlock cycles for both
> ReentrantReadWriteLock
> and GridSpinReadWriteLock using JMH.
>
> Our implementation doesn't offer any perform benefits comparing to
> ReentrantReadWriteLock, their performance are almost equal. This makes
> sense, because essentailly both locks just CASes on a shared variable to
> obtain the read lock. Looks like we can safely remove these "spinners" and
> use ReentrantReadWriteLock instead.
>
> More serious perfomance gain can be achieved if we stripe the lock (e.g.
> like it is done in LongAdder) thus decreasing contention on shared
> variables. Quick experiments shown 5x throughput increase for read
> lock-unlock cycles when lock is striped.
>
> Thoughts?
>
> Vladimir.
>
Reply | Threaded
Open this post in threaded view
|

Re: GridSpinBusyLock performance

Vladimir Ozerov
Exactly, we use it primarily as "busy" lock, i.e. lots of concurrent
readers with writer blocking everything on node stop. But it doesn't
outperform regular ReentrantReadWriteLock actually.

On Mon, Aug 31, 2015 at 6:41 PM, Dmitriy Setrakyan <[hidden email]>
wrote:

> I don't recall exactly, but from what I remember, there were other benefits
> to the spin-lock approach. Don't we use some characteristics of this lock
> to properly shut down the system?
>
> D.
>
> On Mon, Aug 31, 2015 at 5:24 AM, Vladimir Ozerov <[hidden email]>
> wrote:
>
> > Igniters,
> >
> > We have two pretty strange constructs: GridSpinReadWriteLock and base on
> it
> > GridSpinBusyLock.
> > As I understand it was an effort to create more performant RWLock than
> > ReentrantReadWriteLock
> > for cases when wrtie locks are very unlikely.
> >
> > As busy lock concept is also used in some sensitive places of "platforms"
> > module, I measured performance of read lock-unlock cycles for both
> > ReentrantReadWriteLock
> > and GridSpinReadWriteLock using JMH.
> >
> > Our implementation doesn't offer any perform benefits comparing to
> > ReentrantReadWriteLock, their performance are almost equal. This makes
> > sense, because essentailly both locks just CASes on a shared variable to
> > obtain the read lock. Looks like we can safely remove these "spinners"
> and
> > use ReentrantReadWriteLock instead.
> >
> > More serious perfomance gain can be achieved if we stripe the lock (e.g.
> > like it is done in LongAdder) thus decreasing contention on shared
> > variables. Quick experiments shown 5x throughput increase for read
> > lock-unlock cycles when lock is striped.
> >
> > Thoughts?
> >
> > Vladimir.
> >
>
Reply | Threaded
Open this post in threaded view
|

Re: GridSpinBusyLock performance

dsetrakyan
On Mon, Aug 31, 2015 at 11:03 AM, Vladimir Ozerov <[hidden email]>
wrote:

> Exactly, we use it primarily as "busy" lock, i.e. lots of concurrent
> readers with writer blocking everything on node stop. But it doesn't
> outperform regular ReentrantReadWriteLock actually.
>

I don't think this use-case is about performance. Yakov, can you jump in
and remind us why we use the spin locks on Ignite instance stop?


>
> On Mon, Aug 31, 2015 at 6:41 PM, Dmitriy Setrakyan <[hidden email]>
> wrote:
>
> > I don't recall exactly, but from what I remember, there were other
> benefits
> > to the spin-lock approach. Don't we use some characteristics of this lock
> > to properly shut down the system?
> >
> > D.
> >
> > On Mon, Aug 31, 2015 at 5:24 AM, Vladimir Ozerov <[hidden email]>
> > wrote:
> >
> > > Igniters,
> > >
> > > We have two pretty strange constructs: GridSpinReadWriteLock and base
> on
> > it
> > > GridSpinBusyLock.
> > > As I understand it was an effort to create more performant RWLock than
> > > ReentrantReadWriteLock
> > > for cases when wrtie locks are very unlikely.
> > >
> > > As busy lock concept is also used in some sensitive places of
> "platforms"
> > > module, I measured performance of read lock-unlock cycles for both
> > > ReentrantReadWriteLock
> > > and GridSpinReadWriteLock using JMH.
> > >
> > > Our implementation doesn't offer any perform benefits comparing to
> > > ReentrantReadWriteLock, their performance are almost equal. This makes
> > > sense, because essentailly both locks just CASes on a shared variable
> to
> > > obtain the read lock. Looks like we can safely remove these "spinners"
> > and
> > > use ReentrantReadWriteLock instead.
> > >
> > > More serious perfomance gain can be achieved if we stripe the lock
> (e.g.
> > > like it is done in LongAdder) thus decreasing contention on shared
> > > variables. Quick experiments shown 5x throughput increase for read
> > > lock-unlock cycles when lock is striped.
> > >
> > > Thoughts?
> > >
> > > Vladimir.
> > >
> >
>
Reply | Threaded
Open this post in threaded view
|

Re: GridSpinBusyLock performance

Yakov Zhdanov-2
Vova, can you please share your benchmark? That was long time ago and I
recall that I got better results in more complex benchmarks than jmh one.

However, I am quiet excited with your idea to change implementation to
striped RW locks:

* busy state - we do tryLock() on random read lock
* blocked state - since this is very rare, we can acquire write locks on
each stripe.

Is this what you've meant?

--
Yakov Zhdanov, Director R&D
*GridGain Systems*
www.gridgain.com

2015-08-31 21:06 GMT+03:00 Dmitriy Setrakyan <[hidden email]>:

> On Mon, Aug 31, 2015 at 11:03 AM, Vladimir Ozerov <[hidden email]>
> wrote:
>
> > Exactly, we use it primarily as "busy" lock, i.e. lots of concurrent
> > readers with writer blocking everything on node stop. But it doesn't
> > outperform regular ReentrantReadWriteLock actually.
> >
>
> I don't think this use-case is about performance. Yakov, can you jump in
> and remind us why we use the spin locks on Ignite instance stop?
>
>
> >
> > On Mon, Aug 31, 2015 at 6:41 PM, Dmitriy Setrakyan <
> [hidden email]>
> > wrote:
> >
> > > I don't recall exactly, but from what I remember, there were other
> > benefits
> > > to the spin-lock approach. Don't we use some characteristics of this
> lock
> > > to properly shut down the system?
> > >
> > > D.
> > >
> > > On Mon, Aug 31, 2015 at 5:24 AM, Vladimir Ozerov <[hidden email]
> >
> > > wrote:
> > >
> > > > Igniters,
> > > >
> > > > We have two pretty strange constructs: GridSpinReadWriteLock and base
> > on
> > > it
> > > > GridSpinBusyLock.
> > > > As I understand it was an effort to create more performant RWLock
> than
> > > > ReentrantReadWriteLock
> > > > for cases when wrtie locks are very unlikely.
> > > >
> > > > As busy lock concept is also used in some sensitive places of
> > "platforms"
> > > > module, I measured performance of read lock-unlock cycles for both
> > > > ReentrantReadWriteLock
> > > > and GridSpinReadWriteLock using JMH.
> > > >
> > > > Our implementation doesn't offer any perform benefits comparing to
> > > > ReentrantReadWriteLock, their performance are almost equal. This
> makes
> > > > sense, because essentailly both locks just CASes on a shared variable
> > to
> > > > obtain the read lock. Looks like we can safely remove these
> "spinners"
> > > and
> > > > use ReentrantReadWriteLock instead.
> > > >
> > > > More serious perfomance gain can be achieved if we stripe the lock
> > (e.g.
> > > > like it is done in LongAdder) thus decreasing contention on shared
> > > > variables. Quick experiments shown 5x throughput increase for read
> > > > lock-unlock cycles when lock is striped.
> > > >
> > > > Thoughts?
> > > >
> > > > Vladimir.
> > > >
> > >
> >
>
Reply | Threaded
Open this post in threaded view
|

Re: GridSpinBusyLock performance

Vladimir Ozerov
Yes, this is what I am talking about. Actually, for "busy" lock semantics
we need only 3 methods in regular RW lock terms: tryReadLock(),
readUnlock() and writeLock(). No need for reentrancy, write unlocks, etc..

This makes potential implementation very simple: read lock/unlock() methods
are simply atomic increment/decrement, and writeLock is a CAS-loop.

Then we stripe it using either random (as it is done in
Striped64/LongAdder) or using sequece numbers. The latter approach could be
better given that the vast majority of Ignite's internal logic happens in
our own thread pools. If we assign each thread from a pool different stipe
number, they will work without any contention at all.

On Mon, Aug 31, 2015 at 9:17 PM, Yakov Zhdanov <[hidden email]>
wrote:

> Vova, can you please share your benchmark? That was long time ago and I
> recall that I got better results in more complex benchmarks than jmh one.
>
> However, I am quiet excited with your idea to change implementation to
> striped RW locks:
>
> * busy state - we do tryLock() on random read lock
> * blocked state - since this is very rare, we can acquire write locks on
> each stripe.
>
> Is this what you've meant?
>
> --
> Yakov Zhdanov, Director R&D
> *GridGain Systems*
> www.gridgain.com
>
> 2015-08-31 21:06 GMT+03:00 Dmitriy Setrakyan <[hidden email]>:
>
> > On Mon, Aug 31, 2015 at 11:03 AM, Vladimir Ozerov <[hidden email]>
> > wrote:
> >
> > > Exactly, we use it primarily as "busy" lock, i.e. lots of concurrent
> > > readers with writer blocking everything on node stop. But it doesn't
> > > outperform regular ReentrantReadWriteLock actually.
> > >
> >
> > I don't think this use-case is about performance. Yakov, can you jump in
> > and remind us why we use the spin locks on Ignite instance stop?
> >
> >
> > >
> > > On Mon, Aug 31, 2015 at 6:41 PM, Dmitriy Setrakyan <
> > [hidden email]>
> > > wrote:
> > >
> > > > I don't recall exactly, but from what I remember, there were other
> > > benefits
> > > > to the spin-lock approach. Don't we use some characteristics of this
> > lock
> > > > to properly shut down the system?
> > > >
> > > > D.
> > > >
> > > > On Mon, Aug 31, 2015 at 5:24 AM, Vladimir Ozerov <
> [hidden email]
> > >
> > > > wrote:
> > > >
> > > > > Igniters,
> > > > >
> > > > > We have two pretty strange constructs: GridSpinReadWriteLock and
> base
> > > on
> > > > it
> > > > > GridSpinBusyLock.
> > > > > As I understand it was an effort to create more performant RWLock
> > than
> > > > > ReentrantReadWriteLock
> > > > > for cases when wrtie locks are very unlikely.
> > > > >
> > > > > As busy lock concept is also used in some sensitive places of
> > > "platforms"
> > > > > module, I measured performance of read lock-unlock cycles for both
> > > > > ReentrantReadWriteLock
> > > > > and GridSpinReadWriteLock using JMH.
> > > > >
> > > > > Our implementation doesn't offer any perform benefits comparing to
> > > > > ReentrantReadWriteLock, their performance are almost equal. This
> > makes
> > > > > sense, because essentailly both locks just CASes on a shared
> variable
> > > to
> > > > > obtain the read lock. Looks like we can safely remove these
> > "spinners"
> > > > and
> > > > > use ReentrantReadWriteLock instead.
> > > > >
> > > > > More serious perfomance gain can be achieved if we stripe the lock
> > > (e.g.
> > > > > like it is done in LongAdder) thus decreasing contention on shared
> > > > > variables. Quick experiments shown 5x throughput increase for read
> > > > > lock-unlock cycles when lock is striped.
> > > > >
> > > > > Thoughts?
> > > > >
> > > > > Vladimir.
> > > > >
> > > >
> > >
> >
>
Reply | Threaded
Open this post in threaded view
|

Re: GridSpinBusyLock performance

Yakov Zhdanov-2
Yes, I agree. Please file a ticket.

--
Yakov Zhdanov, Director R&D
*GridGain Systems*
www.gridgain.com

2015-08-31 21:29 GMT+03:00 Vladimir Ozerov <[hidden email]>:

> Yes, this is what I am talking about. Actually, for "busy" lock semantics
> we need only 3 methods in regular RW lock terms: tryReadLock(),
> readUnlock() and writeLock(). No need for reentrancy, write unlocks, etc..
>
> This makes potential implementation very simple: read lock/unlock() methods
> are simply atomic increment/decrement, and writeLock is a CAS-loop.
>
> Then we stripe it using either random (as it is done in
> Striped64/LongAdder) or using sequece numbers. The latter approach could be
> better given that the vast majority of Ignite's internal logic happens in
> our own thread pools. If we assign each thread from a pool different stipe
> number, they will work without any contention at all.
>
> On Mon, Aug 31, 2015 at 9:17 PM, Yakov Zhdanov <[hidden email]>
> wrote:
>
> > Vova, can you please share your benchmark? That was long time ago and I
> > recall that I got better results in more complex benchmarks than jmh one.
> >
> > However, I am quiet excited with your idea to change implementation to
> > striped RW locks:
> >
> > * busy state - we do tryLock() on random read lock
> > * blocked state - since this is very rare, we can acquire write locks on
> > each stripe.
> >
> > Is this what you've meant?
> >
> > --
> > Yakov Zhdanov, Director R&D
> > *GridGain Systems*
> > www.gridgain.com
> >
> > 2015-08-31 21:06 GMT+03:00 Dmitriy Setrakyan <[hidden email]>:
> >
> > > On Mon, Aug 31, 2015 at 11:03 AM, Vladimir Ozerov <
> [hidden email]>
> > > wrote:
> > >
> > > > Exactly, we use it primarily as "busy" lock, i.e. lots of concurrent
> > > > readers with writer blocking everything on node stop. But it doesn't
> > > > outperform regular ReentrantReadWriteLock actually.
> > > >
> > >
> > > I don't think this use-case is about performance. Yakov, can you jump
> in
> > > and remind us why we use the spin locks on Ignite instance stop?
> > >
> > >
> > > >
> > > > On Mon, Aug 31, 2015 at 6:41 PM, Dmitriy Setrakyan <
> > > [hidden email]>
> > > > wrote:
> > > >
> > > > > I don't recall exactly, but from what I remember, there were other
> > > > benefits
> > > > > to the spin-lock approach. Don't we use some characteristics of
> this
> > > lock
> > > > > to properly shut down the system?
> > > > >
> > > > > D.
> > > > >
> > > > > On Mon, Aug 31, 2015 at 5:24 AM, Vladimir Ozerov <
> > [hidden email]
> > > >
> > > > > wrote:
> > > > >
> > > > > > Igniters,
> > > > > >
> > > > > > We have two pretty strange constructs: GridSpinReadWriteLock and
> > base
> > > > on
> > > > > it
> > > > > > GridSpinBusyLock.
> > > > > > As I understand it was an effort to create more performant RWLock
> > > than
> > > > > > ReentrantReadWriteLock
> > > > > > for cases when wrtie locks are very unlikely.
> > > > > >
> > > > > > As busy lock concept is also used in some sensitive places of
> > > > "platforms"
> > > > > > module, I measured performance of read lock-unlock cycles for
> both
> > > > > > ReentrantReadWriteLock
> > > > > > and GridSpinReadWriteLock using JMH.
> > > > > >
> > > > > > Our implementation doesn't offer any perform benefits comparing
> to
> > > > > > ReentrantReadWriteLock, their performance are almost equal. This
> > > makes
> > > > > > sense, because essentailly both locks just CASes on a shared
> > variable
> > > > to
> > > > > > obtain the read lock. Looks like we can safely remove these
> > > "spinners"
> > > > > and
> > > > > > use ReentrantReadWriteLock instead.
> > > > > >
> > > > > > More serious perfomance gain can be achieved if we stripe the
> lock
> > > > (e.g.
> > > > > > like it is done in LongAdder) thus decreasing contention on
> shared
> > > > > > variables. Quick experiments shown 5x throughput increase for
> read
> > > > > > lock-unlock cycles when lock is striped.
> > > > > >
> > > > > > Thoughts?
> > > > > >
> > > > > > Vladimir.
> > > > > >
> > > > >
> > > >
> > >
> >
>
Reply | Threaded
Open this post in threaded view
|

Re: GridSpinBusyLock performance

Vladimir Ozerov
I implemented the lock as discussed. Appreciate if someone review it -
org.apache.ignite.internal.util.GridStripedSpinBusyLock.
Corresponding ticket - https://issues.apache.org/jira/browse/IGNITE-1346

On Mon, Aug 31, 2015 at 9:32 PM, Yakov Zhdanov <[hidden email]>
wrote:

> Yes, I agree. Please file a ticket.
>
> --
> Yakov Zhdanov, Director R&D
> *GridGain Systems*
> www.gridgain.com
>
> 2015-08-31 21:29 GMT+03:00 Vladimir Ozerov <[hidden email]>:
>
> > Yes, this is what I am talking about. Actually, for "busy" lock semantics
> > we need only 3 methods in regular RW lock terms: tryReadLock(),
> > readUnlock() and writeLock(). No need for reentrancy, write unlocks,
> etc..
> >
> > This makes potential implementation very simple: read lock/unlock()
> methods
> > are simply atomic increment/decrement, and writeLock is a CAS-loop.
> >
> > Then we stripe it using either random (as it is done in
> > Striped64/LongAdder) or using sequece numbers. The latter approach could
> be
> > better given that the vast majority of Ignite's internal logic happens in
> > our own thread pools. If we assign each thread from a pool different
> stipe
> > number, they will work without any contention at all.
> >
> > On Mon, Aug 31, 2015 at 9:17 PM, Yakov Zhdanov <[hidden email]>
> > wrote:
> >
> > > Vova, can you please share your benchmark? That was long time ago and I
> > > recall that I got better results in more complex benchmarks than jmh
> one.
> > >
> > > However, I am quiet excited with your idea to change implementation to
> > > striped RW locks:
> > >
> > > * busy state - we do tryLock() on random read lock
> > > * blocked state - since this is very rare, we can acquire write locks
> on
> > > each stripe.
> > >
> > > Is this what you've meant?
> > >
> > > --
> > > Yakov Zhdanov, Director R&D
> > > *GridGain Systems*
> > > www.gridgain.com
> > >
> > > 2015-08-31 21:06 GMT+03:00 Dmitriy Setrakyan <[hidden email]>:
> > >
> > > > On Mon, Aug 31, 2015 at 11:03 AM, Vladimir Ozerov <
> > [hidden email]>
> > > > wrote:
> > > >
> > > > > Exactly, we use it primarily as "busy" lock, i.e. lots of
> concurrent
> > > > > readers with writer blocking everything on node stop. But it
> doesn't
> > > > > outperform regular ReentrantReadWriteLock actually.
> > > > >
> > > >
> > > > I don't think this use-case is about performance. Yakov, can you jump
> > in
> > > > and remind us why we use the spin locks on Ignite instance stop?
> > > >
> > > >
> > > > >
> > > > > On Mon, Aug 31, 2015 at 6:41 PM, Dmitriy Setrakyan <
> > > > [hidden email]>
> > > > > wrote:
> > > > >
> > > > > > I don't recall exactly, but from what I remember, there were
> other
> > > > > benefits
> > > > > > to the spin-lock approach. Don't we use some characteristics of
> > this
> > > > lock
> > > > > > to properly shut down the system?
> > > > > >
> > > > > > D.
> > > > > >
> > > > > > On Mon, Aug 31, 2015 at 5:24 AM, Vladimir Ozerov <
> > > [hidden email]
> > > > >
> > > > > > wrote:
> > > > > >
> > > > > > > Igniters,
> > > > > > >
> > > > > > > We have two pretty strange constructs: GridSpinReadWriteLock
> and
> > > base
> > > > > on
> > > > > > it
> > > > > > > GridSpinBusyLock.
> > > > > > > As I understand it was an effort to create more performant
> RWLock
> > > > than
> > > > > > > ReentrantReadWriteLock
> > > > > > > for cases when wrtie locks are very unlikely.
> > > > > > >
> > > > > > > As busy lock concept is also used in some sensitive places of
> > > > > "platforms"
> > > > > > > module, I measured performance of read lock-unlock cycles for
> > both
> > > > > > > ReentrantReadWriteLock
> > > > > > > and GridSpinReadWriteLock using JMH.
> > > > > > >
> > > > > > > Our implementation doesn't offer any perform benefits comparing
> > to
> > > > > > > ReentrantReadWriteLock, their performance are almost equal.
> This
> > > > makes
> > > > > > > sense, because essentailly both locks just CASes on a shared
> > > variable
> > > > > to
> > > > > > > obtain the read lock. Looks like we can safely remove these
> > > > "spinners"
> > > > > > and
> > > > > > > use ReentrantReadWriteLock instead.
> > > > > > >
> > > > > > > More serious perfomance gain can be achieved if we stripe the
> > lock
> > > > > (e.g.
> > > > > > > like it is done in LongAdder) thus decreasing contention on
> > shared
> > > > > > > variables. Quick experiments shown 5x throughput increase for
> > read
> > > > > > > lock-unlock cycles when lock is striped.
> > > > > > >
> > > > > > > Thoughts?
> > > > > > >
> > > > > > > Vladimir.
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>