Folks,

I pushed an initial implementation of IGNITE-3477 to the ignite-3477 branch for community review and further discussion.

Note that the implementation lacks the following features:
- On-heap deserialized values cache
- Full LOCAL cache support
- Eviction policies
- Multiple memory pools
- Distributed joins support
- Off-heap circular remove buffer
- Maybe something else I missed

The subject of this discussion is to determine whether the PageMemory approach is the way to go, because this implementation is almost 2x slower than the current 2.0 branch. There is some room for improvement, but I am not completely sure we can reach the same performance numbers as in 2.0.

I encourage the community to review the code and architecture and share their thoughts here.

Thanks,
AG
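For context on what is being discussed: the core idea behind a page-memory design, cache data kept as fixed-size pages inside a contiguous off-heap region, can be sketched roughly as below. This is an illustrative toy only, under the assumption of a single pre-allocated pool; the class and method names are invented and do not match Ignite's actual PageMemory API.

```java
import java.nio.ByteBuffer;

// Toy sketch of the page-memory idea: one contiguous off-heap region,
// split into fixed-size pages addressed by index. Names are illustrative
// only; the real PageMemory additionally handles free lists, eviction,
// and concurrency.
public final class PagePool {
    static final int PAGE_SIZE = 4096;

    private final ByteBuffer pool; // direct buffer: its contents live off-heap
    private final int pageCount;

    public PagePool(int pageCount) {
        this.pageCount = pageCount;
        this.pool = ByteBuffer.allocateDirect(pageCount * PAGE_SIZE);
    }

    /** Returns a view over page {@code idx}; all views share the same off-heap memory. */
    public ByteBuffer page(int idx) {
        if (idx < 0 || idx >= pageCount)
            throw new IndexOutOfBoundsException("page " + idx);
        ByteBuffer dup = pool.duplicate();  // same memory, independent position/limit
        dup.position(idx * PAGE_SIZE);
        dup.limit((idx + 1) * PAGE_SIZE);
        return dup.slice();                 // capacity == PAGE_SIZE
    }
}
```

Because every `page(idx)` view shares the backing direct buffer, data written through one view is visible through any later view of the same page, and none of the cached data itself contributes to GC pressure.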
I agree, we should fix all the outstanding issues and resolve the performance problems before merging it into 2.0.

Sergi

2016-12-29 12:37 GMT+03:00 Alexey Goncharuk <[hidden email]>:
> I pushed an initial implementation of IGNITE-3477 to the ignite-3477 branch
> for community review and further discussion.
>
> The subject of this discussion is to determine whether the PageMemory
> approach is the way to go, because this implementation is almost 2x
> slower than the current 2.0 branch.
Alex,
> On Dec 29, 2016, at 1:37 AM, Alexey Goncharuk <[hidden email]> wrote:
>
> The subject of this discussion is to determine whether the PageMemory
> approach is the way to go, because this implementation is almost 2x
> slower than the current 2.0 branch.

What is the main reason for that? Some architectural flaw you didn't consider before, or performance issues that have to be cleaned up?

—
Denis
Denis, this is not "dreaming". We can't have a release with fewer features than 1.8. We can only add features, not subtract.

Dmitriy

> On Dec 29, 2016, at 9:49 AM, Denis Magda <[hidden email]> wrote:
>
> What is the main reason for that? Some architectural flaw you didn't
> consider before, or performance issues that have to be cleaned up?
On Thu, Dec 29, 2016 at 1:37 AM, Alexey Goncharuk <[hidden email]> wrote:
> Note that the implementation lacks the following features:
> - On-heap deserialized values cache
> - Full LOCAL cache support
> - Eviction policies
> - Multiple memory pools
> - Distributed joins support
> - Off-heap circular remove buffer
> - Maybe something else I missed

Do we have *blocker* tickets for all the remaining issues? Ignite 2.0 will have to support everything in Ignite 1.0. Otherwise we will not be able to release.

> The subject of this discussion is to determine whether the PageMemory
> approach is the way to go, because this implementation is almost 2x
> slower than the current 2.0 branch. There is some room for improvement,
> but I am not completely sure we can reach the same performance numbers
> as in 2.0.

I would rephrase this. We should all assume that the PageMemory approach is the right approach. Here are the main benefits:
- completely off-heap (minimal GC overhead)
- predictable memory size
- ability to extend to an external store, like disk, without serialization
- etc...

Let's collectively work on ensuring that it can perform as fast as Ignite 1.8.x. If after a thorough investigation we decide that PageMemory cannot perform, then we can start thinking about other approaches.

> I encourage the community to review the code and architecture and share
> their thoughts here.

Completely agree. If anyone has extra cycles, please review the code and suggest any improvements.
So are you, guys, suggesting to accept page-memory as the right one by default, which:
1) Doesn't work with half of the current cache features
2) Halved our performance
3) Goes against the whole Java ecosystem with its "offheap-first" approach?

Good try, but no :-)

Let me clarify on p.3. Offheap-first is not the correct approach. It is a questionable and dangerous way, to say the least. GC is a central component of the whole Java ecosystem. No wonder the most comfortable approach for users is when everything is stored in heap. Offheap solutions were created to mitigate the scalability issues Java faced in recent years due to the rapid decrease in RAM costs. However, it doesn't mean that things will be bad forever. At the moment there are at least 3 modern GCs targeting scalability: G1GC from Oracle, Shenandoah from RedHat, and C4 from Azul. No doubt they will solve (or at least relieve significantly) the problem in the mid-term, with gradual improvement from year to year, month to month.

Moreover, the GC problem is being attacked from different angles. Another major improvement is stack-based structures, which are going to appear in Java as a part of the Valhalla project [1]. When implemented, frameworks will be able to reduce heap allocations significantly. Instead of having several dozen heap objects rooted at our infamous GridCacheMapEntry, we will have only one heap object - GridCacheMapEntry itself.

Okay, okay, this is a matter of years, we need a solution now, so what is wrong with offheap? Only one thing - it *splits server memory into two unrelated pieces* - Java heap and offheap. This is a terrible thing from the user perspective. I already went through this during Hadoop Accelerator development:
- Output data is stored offheap. Cool, no GC!
- Intermediate data, such as our NIO messages, is stored in the Java heap.
Now we run an intensive load and ... OutOfMemoryError! Ok, giving more Java heap, but now ... out of native memory! Finally, in order to make it work we have to give much more memory than needed to one of these parts. Result: *poor memory utilization*.

Things would be much easier if we stored everything either in heap or offheap. But as user code is executed in heap by default, offheap is not an option for the average user. All in all, the offheap approach is valuable for high-end deployments with hundreds of gigabytes of memory. But on commodity hardware with a moderate amount of memory, applications are likely to have problems due to the heap/offheap separation, without any advantages.

So my main concern is: *what about the current heap mode*? It must stay alive. The page-memory approach should be abstracted out and implemented in addition to the current heap approach, not instead of it. Have a high-end machine and suffer from GC? Pick offheap mode. Have a commodity machine? The old good heap mode is your choice.

[1] http://openjdk.java.net/projects/valhalla/

On Fri, Dec 30, 2016 at 9:50 PM, Dmitriy Setrakyan <[hidden email]> wrote:
> I would rephrase this. We should all assume that the PageMemory approach
> is the right approach. Let's collectively work on ensuring that it can
> perform as fast as Ignite 1.8.x.
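The capacity-planning problem Vladimir describes, two independently sized regions where either one can be exhausted while the other sits idle, can be illustrated with standard HotSpot flags. The jar name and the sizes below are invented examples; only -Xmx and -XX:MaxDirectMemorySize are real flags.

```shell
# Heap and off-heap must be budgeted separately. If the workload shifts
# (e.g. more on-heap NIO messages), the 8g heap can hit OutOfMemoryError
# while most of the 24g direct-memory budget sits unused, and vice versa:
# the "poor memory utilization" described above.
java -Xmx8g \
     -XX:MaxDirectMemorySize=24g \
     -jar my-ignite-node.jar   # hypothetical application jar
```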
On Sat, Dec 31, 2016 at 7:07 AM, Vladimir Ozerov <[hidden email]> wrote:
> So my main concern is: *what about the current heap mode*? It must stay
> alive. The page-memory approach should be abstracted out and implemented
> in addition to the current heap approach, not instead of it. Have a
> high-end machine and suffer from GC? Pick offheap mode. Have a commodity
> machine? The old good heap mode is your choice.

Vova, disagree. I don't see a reason to maintain the on-heap implementation if we can make the off-heap one work fast enough. Remember, this is the 1st draft only. Once we optimize it, it will get a lot faster.
Here we need to define what's meant by *fast enough*. Java memory is managed by the JVM, and it's unlikely that any custom memory-management solution like PageMemory will ever outperform it, simply because the Java heap will still be an intermediate layer between an application and PageMemory, passing objects back and forth. Also, PageMemory manages data, as I understand, with JNI-based Unsafe, which also brings a performance hit, etc.

So, personally, I share Vladimir's point of view and would not discontinue the on-heap implementation.

What I can't understand is why PageMemory is so much slower than the current *off*-heap based implementation. The latter has comparable performance with the on-heap implementation and is certainly not twice as slow. Alex G., could you elaborate on this?

—
Denis

> On Dec 31, 2016, at 8:29 AM, Dmitriy Setrakyan <[hidden email]> wrote:
>
> Vova, disagree. I don't see a reason to maintain the on-heap
> implementation if we can make the off-heap one work fast enough.
> Remember, this is the 1st draft only. Once we optimize it, it will get
> a lot faster.
Sorry, I just recalled that Unsafe is not JNI-based. However, my previous point of view remains the same.

> On Dec 31, 2016, at 11:39 PM, Denis Magda <[hidden email]> wrote:
>
> JNI-based Unsafe that also brings performance hit

—
Denis
Dima,
Performance is a serious concern, but not the main one. My point is that standard users working on commodity hardware and requiring only the in-memory mode simply do not need page memory. They need a distributed HashMap. We already have it. It is fast, it is stable, and it has been tested rigorously for years. It does what users need.

The PageMemory approach targets high-end deployments, which hardly represent the majority of our users. Less than 10%, I think. Or maybe <5%, or even <1%. This is who may benefit from page memory. Others will gain nothing except an additional layer of indirection, a drop in performance, and risks of instability. And problems with capacity planning, because it is much harder to plan two memory regions properly than a single one.

I talked to Alexey Goncharuk some time ago, and he said it is not a big deal to abstract out PageMemory. Alex, please confirm. I encourage everyone to stop thinking of dropping the "old" before you have built the "new" and confirmed that it is better.

Let's ensure that the new approach is well-abstracted, add it to 2.0, let it mature for 1-2 years, and then think of dropping the current approach in 3.0. This sounds much better to me.

On Sun, Jan 1, 2017 at 10:42 AM, Denis Magda <[hidden email]> wrote:
> Sorry, I just recalled that Unsafe is not JNI-based. However, my previous
> point of view remains the same.
Vova,
I would qualify the need for PageMemory as strategic for Apache Ignite. With the addition of the SQL Grid component, Ignite can now also satisfy in-memory database use cases, which are very space-consuming and require a new memory management approach. A basic distributed hash map is not going to work for such use cases.

Once PageMemory becomes stable and fast, I don't believe we can afford to maintain two types of memory management in the project. It will be just too time consuming. On top of that, the old approach will simply not be needed anymore.

I am OK with maintaining 2 implementations, assuming we can have a good abstraction for the memory management. However, even in that case, we should all weigh whether it makes sense to spend time on migrating the current on-heap implementation to use these new memory APIs.

D.

On Sun, Jan 1, 2017 at 10:47 PM, Vladimir Ozerov <[hidden email]> wrote:
> Performance is a serious concern, but not the main one. My point is that
> standard users working on commodity hardware and requiring only the
> in-memory mode simply do not need page memory. They need a distributed
> HashMap.
Agree with Dmitriy. We should avoid having multiple implementations of the same thing if possible. Let's put our efforts into fixing the issues with PageMemory.

Sergi

2017-01-03 20:11 GMT+03:00 Dmitriy Setrakyan <[hidden email]>:
> I would qualify the need for PageMemory as strategic for Apache Ignite.
> Once PageMemory becomes stable and fast, I don't believe we can afford
> to maintain two types of memory management in the project.