Hello,
I'd like to ask the community members to share their thoughts and opinions on the following subject.

JCache provides a way to atomically execute one or more actions against a cache entry using the Entry Processor mechanism. The Cache interface exposes an invoke() method that takes the key of the cache entry to be acted upon and an instance of the EntryProcessor interface as its parameters. When invoked, the entry processor is passed an instance of JCache's MutableEntry, which gives the processor exclusive access to the cache entry for the duration of the EntryProcessor.process() call. It's a great feature for delta updates, in-place compute, coordination/agreement (a la ZooKeeper), and so on!

Now, if you were to put your Ignite user hat on, what execution semantics would you as a user expect? Specifically:
1) the EntryProcessor is executed on the key's primary node as well as all backup nodes.
2) the EntryProcessor is executed only on the key's primary node.
3) something else.

Unfortunately, the JCache spec doesn't provide much detail on this feature. The Ignite documentation is silent, too.

Thanks
Andrey
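For readers who have not used the API, here is a minimal single-JVM sketch of the invoke() contract described above. It assumes nothing about Ignite: SketchEntry, SketchProcessor, SketchCache and DeltaUpdateDemo are hypothetical stand-ins that only mirror the shape of JCache's MutableEntry, EntryProcessor and Cache.invoke(), and ConcurrentHashMap.compute is used here to model the exclusive per-key access.

```java
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-ins that mirror the shape of JCache's MutableEntry and
// EntryProcessor. They are NOT the real javax.cache types.
interface SketchEntry<K, V> {
    K getKey();
    V getValue();            // null when the key has no value yet
    void setValue(V value);
}

interface SketchProcessor<K, V, T> {
    T process(SketchEntry<K, V> entry);
}

final class SketchCache<K, V> {
    private final ConcurrentHashMap<K, V> store = new ConcurrentHashMap<>();

    // ConcurrentHashMap.compute applies the function atomically for the key,
    // standing in for the exclusive access that invoke() gives a processor.
    <T> T invoke(K key, SketchProcessor<K, V, T> processor) {
        Object[] result = new Object[1];
        store.compute(key, (k, current) -> {
            Holder<K, V> entry = new Holder<>(k, current);
            result[0] = processor.process(entry);
            return entry.value;  // a null value leaves the key unmapped
        });
        @SuppressWarnings("unchecked")
        T typed = (T) result[0];
        return typed;
    }

    V get(K key) {
        return store.get(key);
    }

    private static final class Holder<K, V> implements SketchEntry<K, V> {
        private final K key;
        private V value;

        Holder(K key, V value) { this.key = key; this.value = value; }

        @Override public K getKey() { return key; }
        @Override public V getValue() { return value; }
        @Override public void setValue(V value) { this.value = value; }
    }
}

final class DeltaUpdateDemo {
    // A delta update: add to a counter without a separate get/put round trip.
    static SketchProcessor<String, Integer, Integer> addTo(int delta) {
        return entry -> {
            int next = (entry.getValue() == null ? 0 : entry.getValue()) + delta;
            entry.setValue(next);
            return next;
        };
    }

    public static void main(String[] args) {
        SketchCache<String, Integer> cache = new SketchCache<>();
        cache.invoke("hits", addTo(1));
        System.out.println(cache.invoke("hits", addTo(4)));  // prints 5
    }
}
```

In this single-node model the processor runs exactly once per invoke(); where and how often it runs in a distributed cache is precisely the open question of this thread.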
Hi Andrey!
Could you please clarify which case you are interested in:
1. backups of cache entries, as determined by the CacheConfiguration#getBackups amount
2. just remote nodes holding some other keys

Thanks!

On Wed, Nov 25, 2015 at 8:43 PM, Andrey Kornev <[hidden email]> wrote:
> [...]
Hi Vladimir,
My question was about your expectations as a user of the JCache API. Specifically, if you were to use JCache's entry processor feature, where and when would you expect the EntryProcessor to be executed once you call the Cache.invoke() method? I wonder if there is anyone here who's used this API?

The reason I'm asking is that the current implementation of this feature in Ignite strikes me as strange, to say the least. And given the lack of detail or guidance in the JCache spec as to how this feature is to be implemented, the only thing left is to ask the community for their opinions and experience.

Based on my previous experience with Coherence's implementation of this feature, I expected the same behavior from Ignite's. But alas, Ignite has its own -- different -- opinion on how it should be done. :)

Regards
Andrey

> Date: Thu, 26 Nov 2015 20:59:48 +0300
> Subject: Re: EntryProcessor execution semantics
> From: [hidden email]
> To: [hidden email]
>
> [...]
Andrey,
If I leave behind my knowledge about Ignite internals, my expectation would be that an EntryProcessor is invoked on all affinity nodes in the grid, both primary and backup. The main reason behind this expectation is that a serialized EntryProcessor instance is usually smaller than the resulting object being stored in the cache, so sending a serialized EntryProcessor should be cheaper. Is there a specific reason you expect an EntryProcessor to be called only once across all the nodes?

I would not assume any restriction on how many times an EntryProcessor is called during a cache update. For example, in the case of an explicit optimistic READ_COMMITTED transaction it may be called more than once, because Ignite needs to calculate a return value for the first invoke() and then has to call it a second time during commit, when transactional locks are held.

The current requirement is that an EntryProcessor should be a stateless function, and it may be called more than once (but of course it will receive the same cache value every time). I agree that this should be properly articulated in the documentation, and I will make sure that it is reflected in the javadocs of the forthcoming 1.5 release.
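The "may be called more than once, with the same cache value" requirement can be illustrated with a small model. This is only a sketch of the behavior as described in the message above, not Ignite code: RepeatedInvocationSketch and invokeTwice are hypothetical names, and the two calls stand in for the pre-commit and commit-time invocations of an optimistic transaction.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.UnaryOperator;

final class RepeatedInvocationSketch {
    // Hypothetical model of the optimistic-transaction case described above:
    // the processor runs once to produce invoke()'s return value and once more
    // at commit, each time against the same stored value.
    static <V> V invokeTwice(V storedValue, UnaryOperator<V> processor) {
        V returned = processor.apply(storedValue);   // first call, before commit
        V committed = processor.apply(storedValue);  // second call, at commit
        if (!returned.equals(committed)) {
            throw new IllegalStateException("processor is not a pure function of its input");
        }
        return committed;
    }

    public static void main(String[] args) {
        // Stateless processor: both calls agree, so repetition is invisible.
        System.out.println(invokeTwice(5, v -> v + 10));  // prints 15

        // Processor with a side effect: the entry is still correct, but the
        // side effect (a stand-in for, say, sending a notification) happens twice.
        AtomicInteger notifications = new AtomicInteger();
        invokeTwice(5, v -> { notifications.incrementAndGet(); return v + 10; });
        System.out.println(notifications.get());          // prints 2
    }
}
```

A processor whose result depends on its own mutable state would return different values on the two calls, which is why the model rejects it.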
Thank you, Alexey!
By stating that "sending a serialized EntryProcessor should be cheaper" you implicitly assume that the cache entry is big and the computation done by the processor is cheap. But what if that's not the case? What if the computation itself is quite expensive and depends on external data (which may happen to be constantly changing, like stock tickers), or is done for a side effect? What is the EP feature good for after all, given the constraints you posed? Incrementing an integer counter, as the example in the Ignite documentation does? :)

Of course, the JCache specification is open to interpretation, and one might argue that the EntryProcessor is a performance feature, but my reading of the spec makes me think (and it looks like both Coherence and Hazelcast agree with me) that it's first and foremost a way to atomically mutate a cache entry without incurring the overhead of locking.

Let's see now. A single call to Cache.invoke() produces:
- a single EP invocation on the key's primary node in Coherence. Period.
- a single EP invocation on the key's primary node in Hazelcast, but they offer the non-JCache BackupAwareEntryProcessor class that allows the user "to create or pass another EntryProcessor to run on backup partitions and apply delta changes to the backup entries".
- In Ignite:
-- a single invocation on the key's primary node if the cache is ATOMIC (both REPLICATED and PARTITIONED).
-- N+1 invocations (where N is the number of nodes the cache is started on) if the cache is REPLICATED and TRANSACTIONAL.
-- B+2 invocations (where B is the number of replicas) if the cache is PARTITIONED and TRANSACTIONAL.

Go figure! Alexey, are you suggesting that a user without deep knowledge of Ignite internals would find such behavior expected and natural? Even with deep knowledge of Ignite internals it's hard to understand the logic.

Neither Coherence nor Hazelcast requires the EP to be stateless and side-effect free. Even better, Hazelcast makes the choice explicit by providing the backup-aware processor API, and it's then up to the user to ensure statelessness etc. But Ignite is just too clever.

I'd really like to ask the brains behind the current design to reconsider.

Regards
Andrey

> Date: Mon, 30 Nov 2015 13:11:13 +0300
> Subject: Re: EntryProcessor execution semantics
> From: [hidden email]
> To: [hidden email]
>
> [...]
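The Ignite invocation counts listed in the message above can be written down as a small function. This is only a restatement of the figures reported in the thread, not something checked against the Ignite source; InvocationCountSketch and its enums are hypothetical names chosen to echo the ATOMIC/TRANSACTIONAL and REPLICATED/PARTITIONED settings mentioned in the message.

```java
final class InvocationCountSketch {
    // Hypothetical enums echoing the cache settings named in the thread;
    // they are not Ignite's own types.
    enum Atomicity { ATOMIC, TRANSACTIONAL }
    enum Mode { REPLICATED, PARTITIONED }

    // Number of EntryProcessor executions for one Cache.invoke() call,
    // according to the counts reported in the message above.
    //   nodes    = N, the number of nodes the cache is started on
    //   replicas = B, the number of replicas
    static int invocationsPerInvoke(Atomicity atomicity, Mode mode, int nodes, int replicas) {
        if (atomicity == Atomicity.ATOMIC) {
            return 1;                   // primary node only, either mode
        }
        return mode == Mode.REPLICATED
                ? nodes + 1             // N+1
                : replicas + 2;         // B+2
    }

    public static void main(String[] args) {
        int nodes = 4;
        int replicas = 1;
        for (Atomicity a : Atomicity.values()) {
            for (Mode m : Mode.values()) {
                System.out.println(a + " " + m + ": "
                        + invocationsPerInvoke(a, m, nodes, replicas));
            }
        }
    }
}
```

With 4 nodes and 1 replica, the reported figures give 1 execution for either ATOMIC configuration, 5 for REPLICATED and TRANSACTIONAL, and 3 for PARTITIONED and TRANSACTIONAL.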
On Mon, Nov 30, 2015 at 9:02 AM, Andrey Kornev <[hidden email]> wrote:
> Neither Coherence nor Hazelcast require the EP to be stateless and
> side-effect free. Even better Hazelcast makes the choice explicit by
> providing the backup aware processor API and it's then up to the user to
> ensure statelessness etc. But Ignite is just too clever.

Andrey, stateful EP seems a bit utopian to me, since the state would not survive between executions anyway. Can you elaborate?
Dmitriy,
Here, by "stateless" I meant whatever Alexey meant in his previous post in this thread. But I'm really talking about being able to have EPs with side effects, and therefore the execution semantics should be "exactly-once" by default.

Besides, maybe it's just me, but intuitively the expectation of Cache.invoke() is that the EP will be executed only once, because *logically* there can only be one entry with the given key in the cache to which the EP is applied. Having the EP executed many times for the same entry comes as a big surprise, at least to me.

Maybe it's worth considering an API similar to what Hazelcast has, to make it possible to explicitly control the EP's execution semantics.

Regards
Andrey

> From: [hidden email]
> Date: Mon, 30 Nov 2015 23:16:58 -0800
> Subject: Re: EntryProcessor execution semantics
> To: [hidden email]
>
> [...]
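To make the "explicit control" idea concrete, here is one possible shape such an API could take, sketched as a single-threaded toy model. Everything in it is an assumption for illustration: BackupAwareSketch, ReplicatedStoreSketch and CountingAdd are hypothetical names, and the interface does not reproduce the actual signatures of Hazelcast's BackupAwareEntryProcessor or any Ignite API. The point is only that the user-supplied logic runs once on the primary copy, while backups receive a separate, cheap processor that applies the already-computed result.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.UnaryOperator;

// Hypothetical API: the primary logic and the backup logic are separate.
interface BackupAwareSketch<V> {
    V processOnPrimary(V current);
    // Built after the primary run, so it can carry the computed result.
    UnaryOperator<V> backupProcessor(V primaryResult);
}

// Toy, single-threaded model of one primary copy plus a fixed number of backups.
final class ReplicatedStoreSketch<K, V> {
    private final Map<K, V> primary = new HashMap<>();
    private final List<Map<K, V>> backups = new ArrayList<>();

    ReplicatedStoreSketch(int backupCount) {
        for (int i = 0; i < backupCount; i++) {
            backups.add(new HashMap<>());
        }
    }

    V invoke(K key, BackupAwareSketch<V> processor) {
        // The user's primary logic runs exactly once, on the primary copy.
        V result = processor.processOnPrimary(primary.get(key));
        primary.put(key, result);
        // Backups only apply the processor derived from the primary result.
        UnaryOperator<V> onBackup = processor.backupProcessor(result);
        for (Map<K, V> backup : backups) {
            backup.put(key, onBackup.apply(backup.get(key)));
        }
        return result;
    }

    V primaryValue(K key) { return primary.get(key); }
    V backupValue(int index, K key) { return backups.get(index).get(key); }
}

// Example processor: counts how often the primary logic actually runs.
final class CountingAdd implements BackupAwareSketch<Integer> {
    final AtomicInteger primaryRuns = new AtomicInteger();
    private final int delta;

    CountingAdd(int delta) { this.delta = delta; }

    @Override
    public Integer processOnPrimary(Integer current) {
        primaryRuns.incrementAndGet();  // stands in for an expensive step or side effect
        return (current == null ? 0 : current) + delta;
    }

    @Override
    public UnaryOperator<Integer> backupProcessor(Integer primaryResult) {
        return ignored -> primaryResult;  // backups just store the computed value
    }
}
```

With two backups, one invoke() of a CountingAdd runs the primary logic once and leaves all three copies holding the same value, which is the "exactly-once by default, opt in to backup execution" behavior argued for above.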
On Tue, Dec 1, 2015 at 8:34 AM, Andrey Kornev <[hidden email]> wrote:
> [...]
> Maybe it's worth considering an API similar to what Hazelcast has to make
> it possible to explicitly control EP's execution semantics.

Andrey, can you create a ticket and propose a design? We could continue this discussion there.