Hi Guys
Alexei Scherbakov report a ticket few time ago [1]. The solution look promissing. Alexei, you wrote that this can save some memory. More over replacing linked Set structure to array based bit-set can give a speed-up due to array based structures are cache friendly. But one thing is not clear for me how we will handle sparsed bit-sets? For example, if we have 1024 partiotions (as it is by default) and have much nodes, e.g. 512. In this case, bit-set will occupy 256 bytes that seem to be more than Set<Integer>. What do you mean exactly to use bit-set as more compact structure then Set<Integer> or bit-set with some additional compression? I would thought, we can use hash-set with open addressing in some cases like that to get gain of array bases structures over linked structures and save memory? For example, we could use such hash-set for small data (64bytes as cache line size) and use bit-sets for bigger data, if it's possible of course. Thoughts? [1] https://issues.apache.org/jira/browse/IGNITE-4554 |
I'd suggest to have a single abstract class `Partitions` with protected
constructor and static factory method. This will allow to add different optimized for any particular case implementations transparently. Sergi 2017-01-21 15:26 GMT+03:00 Andrey Mashenkov <[hidden email]>: > Hi Guys > > Alexei Scherbakov report a ticket few time ago [1]. The solution look > promissing. > > Alexei, you wrote that this can save some memory. More over replacing > linked Set structure to array based bit-set > can give a speed-up due to array based structures are cache friendly. > > But one thing is not clear for me how we will handle sparsed bit-sets? For > example, if we have 1024 partiotions (as it is by default) > and have much nodes, e.g. 512. In this case, bit-set will occupy 256 bytes > that seem to be more than Set<Integer>. > > What do you mean exactly to use bit-set as more compact structure then > Set<Integer> or bit-set with some additional compression? > > I would thought, we can use hash-set with open addressing in some cases > like that to get gain of array bases structures over linked structures and > save memory? > For example, we could use such hash-set for small data (64bytes as cache > line size) and use bit-sets for bigger data, if it's possible of course. > > > Thoughts? > > [1] https://issues.apache.org/jira/browse/IGNITE-4554 > |
Andrey,
IGNITE-4554 is about compressed bit sets, not standart bit sets. Currently I'm implementing data structure based on [1] It should efficiently handle sparse bit sets. [1] http://roaringbitmap.org/ 2017-01-21 16:28 GMT+03:00 Sergi Vladykin <[hidden email]>: > I'd suggest to have a single abstract class `Partitions` with protected > constructor and static factory method. This will allow to add different > optimized for any particular case implementations transparently. > > Sergi > > 2017-01-21 15:26 GMT+03:00 Andrey Mashenkov <[hidden email]>: > > > Hi Guys > > > > Alexei Scherbakov report a ticket few time ago [1]. The solution look > > promissing. > > > > Alexei, you wrote that this can save some memory. More over replacing > > linked Set structure to array based bit-set > > can give a speed-up due to array based structures are cache friendly. > > > > But one thing is not clear for me how we will handle sparsed bit-sets? > For > > example, if we have 1024 partiotions (as it is by default) > > and have much nodes, e.g. 512. In this case, bit-set will occupy 256 > bytes > > that seem to be more than Set<Integer>. > > > > What do you mean exactly to use bit-set as more compact structure then > > Set<Integer> or bit-set with some additional compression? > > > > I would thought, we can use hash-set with open addressing in some cases > > like that to get gain of array bases structures over linked structures > and > > save memory? > > For example, we could use such hash-set for small data (64bytes as cache > > line size) and use bit-sets for bigger data, if it's possible of course. > > > > > > Thoughts? > > > > [1] https://issues.apache.org/jira/browse/IGNITE-4554 > > > -- Best regards, Alexei Scherbakov |
In reply to this post by Sergi
Try this https://github.com/RoaringBitmap/RoaringBitmap
-----邮件原件----- 发件人: Sergi Vladykin [mailto:[hidden email]] 发送时间: 2017年1月21日 21:29 收件人: [hidden email] 主题: Re: Optimize integer sets. I'd suggest to have a single abstract class `Partitions` with protected constructor and static factory method. This will allow to add different optimized for any particular case implementations transparently. Sergi 2017-01-21 15:26 GMT+03:00 Andrey Mashenkov <[hidden email]>: > Hi Guys > > Alexei Scherbakov report a ticket few time ago [1]. The solution look > promissing. > > Alexei, you wrote that this can save some memory. More over replacing > linked Set structure to array based bit-set can give a speed-up due to > array based structures are cache friendly. > > But one thing is not clear for me how we will handle sparsed bit-sets? > For example, if we have 1024 partiotions (as it is by default) and > have much nodes, e.g. 512. In this case, bit-set will occupy 256 bytes > that seem to be more than Set<Integer>. > > What do you mean exactly to use bit-set as more compact structure then > Set<Integer> or bit-set with some additional compression? > > I would thought, we can use hash-set with open addressing in some > cases like that to get gain of array bases structures over linked > structures and save memory? > For example, we could use such hash-set for small data (64bytes as > cache line size) and use bit-sets for bigger data, if it's possible of course. > > > Thoughts? > > [1] https://issues.apache.org/jira/browse/IGNITE-4554 > |
Free forum by Nabble | Edit this page |