Hi Igniters,
It seems that it's not possible to implement effective leader-board with the current Ignite indexes. Leader-board stores a score and an id of a player of some game. Score is indexed. One of the possible requests to that data structure is to get some range of scores based on their rank (it's effectively a number of a row). I suppose it's required for pagination. For example Redis has an index (https://redis.io/topics/data-types#sorted-sets) that can be scanned (https://redis.io/commands/zrange) to a particular line with O(log(n)) complexity. As far as I know it's node-local. Correct me if I'm wrong but in Ignite we scan an index until we find a row corresponding to a limit/offset clause. It looks like a linear complexity. I suppose it could be possible to implement it for REPLICATED caches and local queries. But it's really difficult for me to estimate the efforts. By the way it would be useful for analytical tools, most of them paginate. So what do you think is it possible to make that happen in Ignite and do we need it at all? Ignite 3.0 maybe? -- Sent from: http://apache-ignite-developers.2346864.n4.nabble.com/ |
Hi Vladimir,
Usually, LSM-storage engines serve the range searches the best (Cassandra is one of the examples). The SortedSet of Redis is one of the typical components you can find in LSM-engines. Presently, Ignite neither supports an LSM store nor a SortedSet data structure. However, the range searches with Ignite still have O(log(n)) complexity, it just takes more steps for the B-tree to collect all the records of a range. - Denis On Fri, Sep 4, 2020 at 3:26 AM Vladimir Pligin <[hidden email]> wrote: > Hi Igniters, > > It seems that it's not possible to implement effective leader-board with > the > current Ignite indexes. > Leader-board stores a score and an id of a player of some game. Score is > indexed. One of the possible requests to that data structure is to get some > range of scores based on their rank (it's effectively a number of a row). I > suppose it's required for pagination. For example Redis has an index > (https://redis.io/topics/data-types#sorted-sets) that can be scanned > (https://redis.io/commands/zrange) to a particular line with O(log(n)) > complexity. As far as I know it's node-local. Correct me if I'm wrong but > in > Ignite we scan an index until we find a row corresponding to a limit/offset > clause. It looks like a linear complexity. I suppose it could be possible > to > implement it for REPLICATED caches and local queries. But it's really > difficult for me to estimate the efforts. By the way it would be useful for > analytical tools, most of them paginate. So what do you think is it > possible > to make that happen in Ignite and do we need it at all? Ignite 3.0 maybe? > > > > -- > Sent from: http://apache-ignite-developers.2346864.n4.nabble.com/ > |
Hi Vladimir,
As far as I know, Redis' SortedSet is not shared. Is this correct? Is it even possible to support similar functionality for a partitioned dataset? -Val On Fri, Sep 4, 2020 at 8:58 AM Denis Magda <[hidden email]> wrote: > Hi Vladimir, > > Usually, LSM-storage engines serve the range searches the best (Cassandra > is one of the examples). The SortedSet of Redis is one of the typical > components you can find in LSM-engines. > > Presently, Ignite neither supports an LSM store nor a SortedSet data > structure. However, the range searches with Ignite still have O(log(n)) > complexity, it just takes more steps for the B-tree to collect all the > records of a range. > > - > Denis > > > On Fri, Sep 4, 2020 at 3:26 AM Vladimir Pligin <[hidden email]> > wrote: > > > Hi Igniters, > > > > It seems that it's not possible to implement effective leader-board with > > the > > current Ignite indexes. > > Leader-board stores a score and an id of a player of some game. Score is > > indexed. One of the possible requests to that data structure is to get > some > > range of scores based on their rank (it's effectively a number of a > row). I > > suppose it's required for pagination. For example Redis has an index > > (https://redis.io/topics/data-types#sorted-sets) that can be scanned > > (https://redis.io/commands/zrange) to a particular line with O(log(n)) > > complexity. As far as I know it's node-local. Correct me if I'm wrong but > > in > > Ignite we scan an index until we find a row corresponding to a > limit/offset > > clause. It looks like a linear complexity. I suppose it could be possible > > to > > implement it for REPLICATED caches and local queries. But it's really > > difficult for me to estimate the efforts. By the way it would be useful > for > > analytical tools, most of them paginate. So what do you think is it > > possible > > to make that happen in Ignite and do we need it at all? Ignite 3.0 maybe? > > > > > > > > -- > > Sent from: http://apache-ignite-developers.2346864.n4.nabble.com/ > > > |
Folks, please disregard my previous email. I confused the range search task
when (the search boundaries are defined in a query) with the K largest elements problem described by Vladimir. Personally, I don't think there is a practical need for partitioned sorted sets. Even if you store thousands of scores in a sorted set it will fit and perform nicely on a single node. If a read rate is high, then you can keep a full copy of the set on 2, 3, etc. nodes and load balance reads among the replicas. It should not be challenging to support such a data structure in Ignite. Rough idea: - You store a sorted set by a unique key in our system cache (or whenever we keep other data structures). - Configure a number of full replicas for the set across the cluster (if you want to load balance reads). - Updates: it might be too expensive to use Java SortedSet as the actual data structure on the nodes due to frequent serialization<->deserialization on updates. We can think of our version. - Reads: we need to figure out how to load balance reads across replicas. A naive approach can use some round-robin algorithm on the client-side. For instance, imagine we introduced IgniteSortedSet API and the client calls IgniteSortedSet.range(start, end) method. Internally, IgniteSortedSet implementation can maintain a variable that keeps an id of a server node whose turn is to serve the next request; the variable gets updated in the round-robin fashion. Btw, Redis SortedSet is not scalable or partitioned, and that's not a big deal: https://stackoverflow.com/questions/6948492/will-rediss-sorted-sets-scale - Denis On Fri, Sep 4, 2020 at 2:36 PM Valentin Kulichenko < [hidden email]> wrote: > Hi Vladimir, > > As far as I know, Redis' SortedSet is not shared. Is this correct? Is it > even possible to support similar functionality for a partitioned dataset? > > -Val > > On Fri, Sep 4, 2020 at 8:58 AM Denis Magda <[hidden email]> wrote: > > > Hi Vladimir, > > > > Usually, LSM-storage engines serve the range searches the best (Cassandra > > is one of the examples). The SortedSet of Redis is one of the typical > > components you can find in LSM-engines. > > > > Presently, Ignite neither supports an LSM store nor a SortedSet data > > structure. However, the range searches with Ignite still have O(log(n)) > > complexity, it just takes more steps for the B-tree to collect all the > > records of a range. > > > > - > > Denis > > > > > > On Fri, Sep 4, 2020 at 3:26 AM Vladimir Pligin <[hidden email]> > > wrote: > > > > > Hi Igniters, > > > > > > It seems that it's not possible to implement effective leader-board > with > > > the > > > current Ignite indexes. > > > Leader-board stores a score and an id of a player of some game. Score > is > > > indexed. One of the possible requests to that data structure is to get > > some > > > range of scores based on their rank (it's effectively a number of a > > row). I > > > suppose it's required for pagination. For example Redis has an index > > > (https://redis.io/topics/data-types#sorted-sets) that can be scanned > > > (https://redis.io/commands/zrange) to a particular line with O(log(n)) > > > complexity. As far as I know it's node-local. Correct me if I'm wrong > but > > > in > > > Ignite we scan an index until we find a row corresponding to a > > limit/offset > > > clause. It looks like a linear complexity. I suppose it could be > possible > > > to > > > implement it for REPLICATED caches and local queries. But it's really > > > difficult for me to estimate the efforts. By the way it would be useful > > for > > > analytical tools, most of them paginate. So what do you think is it > > > possible > > > to make that happen in Ignite and do we need it at all? Ignite 3.0 > maybe? > > > > > > > > > > > > -- > > > Sent from: http://apache-ignite-developers.2346864.n4.nabble.com/ > > > > > > |
Denis,
I absolutely agree with you that a SortedSet doesn't need to be partitioned. The main idea I wanted to put attention on is that now it's not possible to find a row with a particular number with log complexity. We need to scan node-local index until we reach some offset, it seems to be linear. For example Redis uses a combination of a hash table and skip list to achieve that. I'm not aware how complicated it is to implement it in Ignite as we use trees. Just wanted to draw some attention to it. -- Sent from: http://apache-ignite-developers.2346864.n4.nabble.com/ |
Vladimir,
What do you think about my high-level solution shared in the previous email? What if we implement the SortedSet they way I suggested? - Denis On Tue, Sep 8, 2020 at 4:15 AM Vladimir Pligin <[hidden email]> wrote: > Denis, > > I absolutely agree with you that a SortedSet doesn't need to be > partitioned. > The main idea I wanted to put attention on is that now it's not possible to > find a row with a particular number with log complexity. We need to scan > node-local index until we reach some offset, it seems to be linear. For > example Redis uses a combination of a hash table and skip list to achieve > that. I'm not aware how complicated it is to implement it in Ignite as we > use trees. Just wanted to draw some attention to it. > > > > > > -- > Sent from: http://apache-ignite-developers.2346864.n4.nabble.com/ > |
Free forum by Nabble | Edit this page |