Hi, Igniters!
The current implementation of IgniteSet is fragile with respect to cluster recovery from a checkpoint. We have an issue (IGNITE-5553) that addresses the set's size() behavior, but the problem is slightly broader. The text below is my comment from the Jira issue. I encourage you to discuss it.

We can put the current set size into the set header cache entry. This will fix size(), but the iterator() implementation is broken as well.

Currently, the set implementation maintains plain Java sets on every node, see CacheDataStructuresManager.setDataMap. These sets duplicate backing-cache entries, both primary and backup. size() and iterator() calls issue distributed queries to collect/filter data from all setDataMaps. And the setDataMaps remain empty after the cluster is recovered from a checkpoint.

Now I see the following options to fix the issue.

#1 - Naive. Iterate over all datastructure-backing cache entries during the recovery-from-checkpoint procedure, filter out set-related entries and refill the setDataMaps.
Pros: easy to implement.
Cons: unpredictable time/memory overhead.

#2 - More realistic. Avoid node-local copies of cache data. Maintain a linked list in the datastructure-backing cache: the key is a set item, the value is the next set item. The list head is stored in the set header cache entry (that set item is the youngest one). Iterators built on top of this structure are fail-fast.
Pros: less memory overhead, no need to maintain node-local mirrors of cache data.
Cons: iterators are not fail-safe.

#3 - Option #2 modified. We can store a reference counter and a 'removed' flag along with the next-item reference. This allows us to make iterators fail-safe.
Pros: iterators are fail-safe.
Cons: slightly more complicated implementation, may affect performance; also I see no way to handle active iterators on remote node failures.

Best regards,
Andrey.
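Option #2 can be sketched roughly as follows. This is only an illustrative model, not Ignite code: a plain ConcurrentHashMap stands in for the distributed backing cache, and names like LinkedSetSketch and headerKey are made up. Removal is deliberately omitted, since unlinking an item from a singly linked list under concurrent iteration is exactly what option #3's reference counter and 'removed' flag would have to handle.

```java
import java.util.ConcurrentModificationException;
import java.util.Iterator;
import java.util.NoSuchElementException;
import java.util.concurrent.ConcurrentHashMap;

// Model of option #2: for each set item, the backing cache stores a
// reference to the next item; the header entry stores the list head.
public class LinkedSetSketch<T> {
    private static final Object NIL = new Object(); // end-of-list marker

    // Stand-in for the datastructure-backing cache.
    private final ConcurrentHashMap<Object, Object> cache = new ConcurrentHashMap<>();
    private final Object headerKey = "set-header";  // the set header cache entry

    public LinkedSetSketch() {
        cache.put(headerKey, NIL);
    }

    /** Prepend a new item: it becomes the new head (the "youngest" one). */
    public synchronized boolean add(T item) {
        if (cache.containsKey(item))
            return false;                  // already in the set
        Object oldHead = cache.put(headerKey, item);
        cache.put(item, oldHead);          // value = next set item
        return true;
    }

    public boolean contains(T item) {
        return cache.containsKey(item);    // O(1), no node-local mirror needed
    }

    /** Fail-fast iterator walking the next-item references. */
    public Iterator<T> iterator() {
        return new Iterator<T>() {
            private Object cur = cache.get(headerKey);

            public boolean hasNext() {
                return cur != NIL;
            }

            @SuppressWarnings("unchecked")
            public T next() {
                if (cur == NIL)
                    throw new NoSuchElementException();
                T item = (T) cur;
                cur = cache.get(item);     // null if the link was removed concurrently
                if (cur == null)
                    throw new ConcurrentModificationException();
                return item;
            }
        };
    }
}
```

Note that iteration order is newest-first, because new items are always linked in at the head stored in the header entry.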
Hi Andrey,
Thanks for the detailed email. I think your suggestions do make sense. Ignite cannot afford to have a distributed set that is not fail-safe. Can you please focus only on solutions that provide consistent behavior in case of topology changes and failures, and document them in the ticket?

https://issues.apache.org/jira/browse/IGNITE-5553

D.
Hello, Igniters!
We have some issues with the current IgniteSet implementation ([1], [2], [3], [4]).

As was already described in this conversation, the main problem is that the current IgniteSet implementation maintains plain Java sets on every node (see CacheDataStructuresManager.setDataMap). These sets duplicate backing-cache entries, both primary and backup. size() and iterator() calls issue distributed queries to collect/filter data from all setDataMaps.

I believe we can solve these issues if each instance of IgniteSet has a separate internal cache that is destroyed on close.

What do you think about such a major change? Do you have any thoughts or objections?

[1] https://issues.apache.org/jira/browse/IGNITE-7565
[2] https://issues.apache.org/jira/browse/IGNITE-5370
[3] https://issues.apache.org/jira/browse/IGNITE-5553
[4] https://issues.apache.org/jira/browse/IGNITE-6474
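For illustration, the cache-per-set idea might look roughly like this. The registry map below stands in for the cluster's cache manager, and the naming scheme is invented for the sketch; a real implementation would presumably go through Ignite.getOrCreateCache with a dedicated CacheConfiguration and destroy that cache when the set is closed.

```java
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Sketch: one dedicated backing store per IgniteSet instance, destroyed on
// close. With this layout size() and iterator() can go straight to the
// dedicated cache, so no node-local mirror (setDataMap) is needed.
public class SetPerCacheSketch {
    // Stand-in for the cluster's cache registry (cache name -> cache).
    private final Map<String, Set<Object>> caches = new ConcurrentHashMap<>();

    public Set<Object> getOrCreateSet(String name) {
        // Hypothetical per-set cache name; a real version might do
        // ignite.getOrCreateCache(new CacheConfiguration<>("datastructures_" + name)).
        return caches.computeIfAbsent("datastructures_" + name,
            k -> ConcurrentHashMap.newKeySet());
    }

    public void closeSet(String name) {
        // Destroying the dedicated cache drops all set data at once.
        caches.remove("datastructures_" + name);
    }
}
```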
Pavel,
I don't like the idea of creating a separate cache for each data structure, especially for collocated ones. Actually, I'm not sure I understand how that would help. It sounds like we just need to properly persist the data structures cache and then reload it on restart.

-Val
Hello, Valentin.
Thank you for the reply.

As mentioned in this conversation, for now we have at least two issues with IgniteSet:
1. Incorrect behavior after recovery from PDS [1].
2. The data in the cache is duplicated on-heap [2], which is not documented and leads to heap/GC overhead when using large sets.

Without significant changes, it is possible to solve [1] with the workaround proposed by Andrey Kuznetsov: iterate over all datastructure-backing cache entries during the recovery-from-checkpoint procedure, filter out set-related entries and refill the setDataMaps.
As a workaround for [2] we can add a configuration option that selects the data structure to use for "local caching" (on-heap or off-heap). If we go this way then cache data duplication will remain, and some kind of off-heap ConcurrentHashMap would have to be implemented in Ignite (it probably already exists in some form; this topic needs proper investigation).

On the other hand, if we use a separate cache for each IgniteSet instance:
1. It will not be necessary to maintain redundant data stored somewhere other than the cache.
2. It will not be necessary to implement a workaround for recovery from PDS.
For the collocated mode we can, for example, enforce the REPLICATED cache mode.

Why don't you like the idea of a separate cache?

[1] https://issues.apache.org/jira/browse/IGNITE-7565
[2] https://issues.apache.org/jira/browse/IGNITE-5553
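The recovery workaround from item 1 could be sketched like this. It is purely illustrative: a plain map stands in for the persisted backing cache, and SetItemKey is a hypothetical marker type standing in for Ignite's internal set-item keys.

```java
import java.util.Map;
import java.util.Set;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

// Sketch of the workaround: after restart, scan the recovered backing-cache
// entries, keep only the set-related ones, and rebuild the node-local
// setDataMap so that size()/iterator() queries work again.
public class SetDataMapRecovery {
    /** Hypothetical key type for set items, carrying the owning set's id. */
    static final class SetItemKey {
        final UUID setId;
        final Object item;

        SetItemKey(UUID setId, Object item) {
            this.setId = setId;
            this.item = item;
        }
    }

    /** Rebuild setDataMap (setId -> items) from the recovered cache entries. */
    static Map<UUID, Set<Object>> refill(Map<Object, Object> backingCache) {
        Map<UUID, Set<Object>> setDataMap = new ConcurrentHashMap<>();

        for (Object key : backingCache.keySet()) {
            if (key instanceof SetItemKey) {        // filter set-related entries
                SetItemKey itemKey = (SetItemKey) key;
                setDataMap.computeIfAbsent(itemKey.setId,
                        id -> ConcurrentHashMap.newKeySet())
                    .add(itemKey.item);
            }
        }
        return setDataMap;
    }
}
```

The unpredictable cost noted in the thread is visible here: the scan touches every entry of the backing cache and re-materializes all set items on the heap.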
Hi Pavel,
We have two types of data structures, collocated and non-collocated. The difference between them is that a collocated set is generally smaller and will always end up on the same node. Users will generally have many collocated sets. On the other hand, a non-collocated set can span multiple nodes and is therefore able to store a lot more data.

I can see how the cache-per-set strategy can be applied to non-collocated sets. As a matter of fact, I would be surprised if it is not implemented that way already.

However, I do not see this strategy applied to collocated sets. Users can have thousands of collocated sets or more. Are you suggesting that this will translate into thousands of caches?

D.
Pavel,
I'm a bit confused. In my understanding, the issue exists because we have local in-memory maps which are used as the main source of truth about which structures currently exist. During a restart, we lose all this data even if the data structures cache(s) are persisted. Once we fix this, the issue goes away, regardless of whether we store a data structure per cache or everything in a single cache. Am I missing something?

I also agree with Dmitriy. While the cache-per-set approach can make sense for non-collocated sets, for collocated ones it definitely doesn't. So I would fix the original issue first, and then change the architecture if it's really needed.

-Val
Hi all,
The current set implementation has a significant flaw: all set data is duplicated in on-heap maps on _every_ node in order to implement iterator() and size(). To me this looks like a simple yet ineffective implementation. Currently, these maps are damaged by checkpointing/recovery, and we could patch them somehow; but another future change to Ignite caches could damage them again. This approach looks fragile as long as the data structure is not entirely backed by caches. Pavel's proposal seems to be a reliable solution for non-collocated sets.
On Fri, Feb 9, 2018 at 6:26 PM, Andrey Kuznetsov <[hidden email]> wrote:
> Hi all,
>
> Current set implementation has significant flaw: all set data are duplicated in onheap maps on _every_ node in order to make iterator() and size(). [...]

I would agree, but I was under the impression that non-collocated sets are already implemented this way. Am I wrong?
Indeed, all sets, regardless of whether they are collocated or not, share a single cache, and they also use on-heap data structures that are not resilient to checkpointing/recovery.

--
Best regards,
Andrey Kuznetsov.
Hello, Igniters!
I agree that the solution with separate caches is not acceptable for a large number of sets.

So, I want to suggest one more way to implement IgniteSet that introduces element indexes (similar to IgniteQueue). To implement this we can add head/tail indexes to the IgniteSet header and, for each IgniteSet element, store two key-value pairs:

  setKey -> index
  index -> setKey

Indexes are required to support the iterator, and they should be continuous.

Please see the detailed description in the JIRA comment [1].

With such an approach, add/remove/contains operations will have O(1) time complexity, the iterator should work similarly to the current IgniteQueue iterator, and issues [2], [3] will be resolved, because PDS recovery will work "out of the box" and we will not use the JVM heap for duplicated values.

Btw, we can use this implementation only for collocated mode (map keys/indexes to the IgniteSet name) and use separate caches for non-collocated mode.

What do you think about this?

[1] https://issues.apache.org/jira/browse/IGNITE-5553#comment-16364043
[2] https://issues.apache.org/jira/browse/IGNITE-5553
[3] https://issues.apache.org/jira/browse/IGNITE-7565
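For readers following along, the two-pair layout from Pavel's proposal can be sketched with plain Java maps standing in for the backing cache entries. All names here are hypothetical, and the swap-with-last step in remove() is one assumed way to keep the indexes continuous, as the proposal requires; this is a sketch of the idea, not Ignite code.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of the proposed index-based set layout. Plain HashMaps stand in
// for the backing cache; in Ignite these would be cache entries keyed by
// (setId, element) and (setId, index), with head/tail in the set header.
public class IndexedSetSketch {
    private final Map<String, Long> elemToIdx = new HashMap<>(); // setKey -> index
    private final Map<Long, String> idxToElem = new HashMap<>(); // index -> setKey
    private long head = 0; // would live in the set header entry
    private long tail = 0;

    public boolean add(String e) {
        if (elemToIdx.containsKey(e))
            return false;
        elemToIdx.put(e, tail);
        idxToElem.put(tail, e);
        tail++;
        return true;
    }

    public boolean contains(String e) {
        return elemToIdx.containsKey(e); // O(1) lookup by element
    }

    // Moves the last element into the freed slot so that indexes in
    // [head, tail) stay continuous, which the iterator relies on.
    public boolean remove(String e) {
        Long idx = elemToIdx.remove(e);
        if (idx == null)
            return false;
        long last = tail - 1;
        String moved = idxToElem.remove(last);
        if (last != idx) {
            idxToElem.put(idx, moved);
            elemToIdx.put(moved, idx);
        }
        tail = last;
        return true;
    }

    public int size() {
        return (int) (tail - head); // O(1), no node-local mirror needed
    }
}
```

An iterator would simply walk indexes from head to tail, fetching one entry per step, similar to the current IgniteQueue iterator.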
I do not think indexes are the right approach - sets do not have indexes, and you would have to maintain an additional counter in order to know when to stop.

From what I see, there are two distinct problems:

1) Broken recovery - this is just a bug which needs to be fixed. As soon as data is stored in a real persistent cache, recovery of the data structure state should be a trivial task.

2) Heap items - this should not be a problem in the common case, when a set contains a moderate number of elements. If a set is excessively large, then this is not the right structure for your use case and you should use the standard IgniteCache API instead. What we can do is optionally disable on-heap caching for a specific set, at the cost of lower performance, if the user wants so.

On Wed, Feb 14, 2018 at 4:51 PM, Pavel Pereslegin <[hidden email]> wrote:

> I agree that solution with separate caches is not acceptable for a large number of sets. [...]
On Thu, Feb 15, 2018 at 6:08 AM, Vladimir Ozerov <[hidden email]> wrote:

> I do not think indexes is the right approach - set do not have indexes, and you will have to maintain additional counter for it in order to know when to stop. [...]

Vladimir, I am not sure I agree. In my view, a set should be similar to a cache, just with a different API. I am not sure why we should make the assumption that set data should be smaller than cache data, especially given that it is a trivial task to implement a set based on the Ignite cache API (we could just store key-key mappings in the cache instead of key-value mappings internally).

Can you clarify why you believe that IgniteSet needs to have on-heap entries?

D.
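The key-key idea Dmitriy mentions is essentially the shape the JDK itself offers for building a Set over a map. A minimal illustration, with a ConcurrentHashMap standing in for the Ignite cache - this is plain JDK code, not Ignite API:

```java
import java.util.Collections;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

public class KeyKeySetDemo {
    // Builds a full Set view over a map: the element is the map key and the
    // value is just a marker. Backing a Set by key-key (or key-marker)
    // cache entries has the same shape, so no separate on-heap mirror of
    // the data is needed.
    public static Set<String> newSet() {
        return Collections.newSetFromMap(new ConcurrentHashMap<String, Boolean>());
    }
}
```

If the map were a persistent cache, recovery of the set would follow the cache's persistence for free.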
Hello Vladimir,
> What we can do is to optionally disable on-heap caching for specific set at the cost of lower performance if user wants so.

I want to make sure that we are speaking about the same thing. By "on-heap caching" I mean the custom data structure, not the standard on-heap cache; we can't disable it now, because it is part of the current IgniteSet implementation. Is it acceptable to optionally use GridOffHeapMap instead of ConcurrentHashMap for an "off-heap mode"?
I think the root issue is that we are trying to mix different cases in a single solution. What are the common usage patterns of sets?

1) Small, mostly-read sets - the current implementation is ideal for them: everything is available locally, on-heap and in deserialized form.

2) Big data sets - the IgniteCache API is the best candidate here; we can simply add a method "Set<K> IgniteCache.asSet()" which will return a thin wrapper implementing the Set interface around the cache. And you will have recovery and heap-resistance with almost no additional effort.

There is no need to choose between one solution or another. We can support both.

As for the current IgniteSet implementation - all the issues described above will go away if you use the approach suggested by Dmitry. I do not see a need for any major changes.

Vladimir.

On Mon, Feb 19, 2018 at 10:44 AM, Pavel Pereslegin <[hidden email]> wrote:

> I want to make sure that we are speaking about the same thing. [...]
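A minimal sketch of the thin wrapper Vladimir describes, assuming a plain Map as a stand-in for IgniteCache. The asSet name follows the proposal in this thread; nothing here is existing Ignite API, it only shows how little code such a view needs.

```java
import java.util.AbstractSet;
import java.util.Iterator;
import java.util.Map;
import java.util.Set;

public class CacheSetView {
    // Returns a live Set view over a key-key map: mutations delegate
    // straight to the map, so there is no duplicated on-heap state and,
    // if the map were a persistent cache, recovery would come for free.
    public static <K> Set<K> asSet(final Map<K, K> cache) {
        return new AbstractSet<K>() {
            @Override public boolean add(K k) {
                return cache.putIfAbsent(k, k) == null;
            }
            @Override public boolean contains(Object k) {
                return cache.containsKey(k);
            }
            @Override public boolean remove(Object k) {
                return cache.remove(k) != null;
            }
            @Override public int size() {
                return cache.size();
            }
            @Override public Iterator<K> iterator() {
                return cache.keySet().iterator();
            }
        };
    }
}
```

Because it is a view rather than a copy, entries put into the underlying map are immediately visible through the Set, and vice versa.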
As far as I know, Pavel P. is working on fixing existing sets currently.
As for the {{asSet}} cache adapter, I have filed the ticket [1].

[1] https://issues.apache.org/jira/browse/IGNITE-7823

--
Best regards,
Andrey Kuznetsov.
Hello Igniters.
I'm working on the implementation of the IgniteCache#asSet method [1], and I think it should return Set (not IgniteSet), because IgniteSet was introduced mainly to add methods for the collocated version of the set.

Any thoughts?

[1] https://issues.apache.org/jira/browse/IGNITE-7823
Hi Pavel,
I did not quite understand the purpose of the new collection and the purpose of the newly introduced set.

Does the ticket description state that it is for better documentation?

Can we improve the task description at https://issues.apache.org/jira/browse/IGNITE-7823 ?

Sincerely,
Dmitriy Pavlov
Hi, Dmitry.
The primary goal of the ticket is to implement the {{IgniteCache::asSet}} view. The rationale was introduced earlier in this thread by Vladimir O. I've also mentioned the need to document the new method properly: it should be used to create large sets, while the existing {{IgniteSet}} is good for representing small sets.

--
Best regards,
Andrey Kuznetsov.
I am not sure I like the "asSet()" method. We already have the Ignite.set(...) method, and now we are introducing yet another one. The new design should fix the existing implementation. We cannot keep the broken implementation around and introduce yet another one. To be consistent, we should also stick to the IgniteSet abstraction.

Just to be clear, I agree that a non-collocated set should be based on a cache, but I do not see why the existing API cannot be re-used.

D.

On Wed, Mar 14, 2018 at 1:26 PM, Andrey Kuznetsov <[hidden email]> wrote:

> The primary goal of the ticket is to implement {{IgniteCache::asSet}} view. [...]