IgniteCache.invoke on ALL keys

Pavel Tupitsyn-3
Igniters,

Looks like we do not have an efficient way to perform an action on EVERY
cache entry.

Let's say I want to remove all entries that match a predicate.
My only option is to retrieve these entries via Scan or SQL query, and then
call removeAll.
This involves a lot of unnecessary network trips (send keys to caller node,
send them back to primary nodes).

Would it be possible to implement a method like
void IgniteCache.invokeAll(entryProcessor)
that invokes the processor on all entries and does not return anything?
There could be more overloads that return results or only return results
for changed entries.
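
To make the cost argument above concrete, here is a minimal sketch in plain Java. It is a toy model, not Ignite code: the "nodes" are in-memory maps, and the class name `RoundTripModel`, its methods, and the `keysOnWire` counter are all made up for illustration. It only counts how many keys would cross the network in the scan-then-removeAll path versus a path where each node applies the predicate to its own entries; the cost of shipping the predicate itself is ignored.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

/** Toy model only: each "node" is an in-memory map; nothing here is Ignite API. */
final class RoundTripModel {
    /** Hypothetical stand-in for the primary copies held by each server node. */
    final List<Map<Integer, String>> nodes = new ArrayList<>();

    /** Number of keys that crossed the (simulated) network. */
    long keysOnWire;

    RoundTripModel(int nodeCount, int entries) {
        for (int i = 0; i < nodeCount; i++)
            nodes.add(new HashMap<>());
        // Keys are non-negative, so key % nodeCount picks the owning node.
        for (int k = 0; k < entries; k++)
            nodes.get(k % nodeCount).put(k, "v" + k);
    }

    /** Scan on the servers, ship matching keys to the caller, then ship them back for removal. */
    void scanThenRemoveAll(Predicate<Integer> filter) {
        List<Integer> matched = new ArrayList<>();
        for (Map<Integer, String> node : nodes) {
            for (Integer key : node.keySet()) {
                if (filter.test(key)) {
                    matched.add(key);
                    keysOnWire++; // key travels from its primary node to the caller
                }
            }
        }
        for (Integer key : matched) {
            keysOnWire++; // key travels from the caller back to its primary node
            nodes.get(key % nodes.size()).remove(key);
        }
    }

    /** Ship only the predicate; every node filters and removes its own entries in place. */
    void removeInPlace(Predicate<Integer> filter) {
        for (Map<Integer, String> node : nodes)
            node.keySet().removeIf(filter);
    }

    /** Total number of entries left across all nodes. */
    int size() {
        int total = 0;
        for (Map<Integer, String> node : nodes)
            total += node.size();
        return total;
    }
}
```

In this model every matching key is counted twice on the first path (once to the caller, once back), while the in-place path moves no keys at all. A real cluster has more moving parts (batching, backups, rebalancing), so treat the numbers as an illustration of the argument rather than a measurement.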

Thoughts?

Pavel.

Re: IgniteCache.invoke on ALL keys

dsetrakyan
I think we do support this use case. Why not send a computation to a server
and then perform the iteration through the cache entries locally on that
server?

On Mon, May 30, 2016 at 4:44 AM, Pavel Tupitsyn <[hidden email]>
wrote:

> Igniters,
>
> Looks like we do not have an efficient way to perform an action on EVERY
> cache entry.
>
> Let's say I want to remove all entries that match a predicate.
> My only option is to retrieve these entries via Scan or SQL query, and then
> call removeAll.
> This involves a lot of unnecessary network trips (send keys to caller node,
> send them back to primary nodes).
>
> Would it be possible to implement a method like
> void IgniteCache.invokeAll(entryProcessor)
> that invokes the processor on all entries and does not return anything?
> There could be more overloads that return results or only return results
> for changed entries.
>
> Thoughts?
>
> Pavel.
>

Re: IgniteCache.invoke on ALL keys

Pavel Tupitsyn-3
Dmitriy, as I understand, there is no reliable way to do that if
rebalancing happens.

On Mon, May 30, 2016 at 6:50 PM, Dmitriy Setrakyan <[hidden email]>
wrote:

> I think we do support this use case. Why not send a computation to a server
> and then perform the iteration through the cache entries locally on that
> server?

Re: IgniteCache.invoke on ALL keys

dsetrakyan
Actually I have seen a ticket to block moving partitions if affinityCall or
affinityRun are called. I think once these tickets are implemented, the
process will become reliable, no?

D.

On Mon, May 30, 2016 at 9:13 AM, Pavel Tupitsyn <[hidden email]>
wrote:

> Dmitriy, as I understand, there is no reliable way to do that if
> rebalancing happens.

Re: IgniteCache.invoke on ALL keys

Vladimir Ozerov
Affinity run/call operate on a single key AFAIK.

On Mon, May 30, 2016 at 10:55 PM, Dmitriy Setrakyan <[hidden email]>
wrote:

> Actually I have seen a ticket to block moving partitions if affinityCall or
> affinityRun are called. I think once these tickets are implemented, the
> process will become reliable, no?
>
> D.

Re: IgniteCache.invoke on ALL keys

Yakov Zhdanov-2
Vova, even though it operates on a single key, we will need to "pin" the
partition so that the key does not move to another node.

Pavel, you can also send closures to all primary nodes to run a local scan
query for each partition. This way you will go over each entry.
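
The per-partition idea above can be sketched in plain Java. This is a toy model under stated assumptions, not Ignite code: partitions are plain maps, the partition count of 8 is arbitrary, and `PartitionScanModel`, `scanLocalPartitions` and `forEachEntry` are hypothetical names. Partition ownership is held fixed, so the rebalancing concern raised earlier in the thread is not modeled.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiConsumer;

/** Toy model only: partitions are plain maps and "nodes" are lists of partition ids. */
final class PartitionScanModel {
    static final int PARTITIONS = 8; // hypothetical partition count

    /** Partition id -> entries of that partition. */
    final Map<Integer, Map<Integer, String>> partitions = new HashMap<>();

    /** Node index -> ids of the partitions for which that node is primary. */
    final List<List<Integer>> primaryPartitions = new ArrayList<>();

    PartitionScanModel(int nodeCount, int entries) {
        for (int n = 0; n < nodeCount; n++)
            primaryPartitions.add(new ArrayList<>());
        for (int p = 0; p < PARTITIONS; p++) {
            partitions.put(p, new HashMap<>());
            primaryPartitions.get(p % nodeCount).add(p); // exactly one primary per partition
        }
        for (int k = 0; k < entries; k++)
            partitions.get(k % PARTITIONS).put(k, "v" + k);
    }

    /** What the closure sent to one node does: scan each of its primary partitions locally. */
    void scanLocalPartitions(int node, BiConsumer<Integer, String> action) {
        for (int p : primaryPartitions.get(node))
            partitions.get(p).forEach(action);
    }

    /** "Broadcast": run the closure once on every node. */
    void forEachEntry(BiConsumer<Integer, String> action) {
        for (int node = 0; node < primaryPartitions.size(); node++)
            scanLocalPartitions(node, action);
    }
}
```

Because each partition has exactly one primary in the model, every entry is visited exactly once. If ownership could change while the closures run, that guarantee would need extra machinery, which is what the partition "pinning" discussion is about.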

Thanks!
--
Yakov Zhdanov, Director R&D
*GridGain Systems*
www.gridgain.com

2016-05-30 16:05 GMT-04:00 Vladimir Ozerov <[hidden email]>:

> Affinity run/call operate on a single key AFAIK.

Re: IgniteCache.invoke on ALL keys

Denis Magda
Pavel,

Here is an example of how to run scan queries on partition owners and perform an operation on every entry:
https://github.com/gridgain/gridgain-advanced-examples/blob/master/src/main/java/org/gridgain/examples/datagrid/query/ScanQueryExample.java

Once this feature [1] is implemented, it will be possible to postpone partition movement while such an operation is in progress. At present, if rebalancing happens, the data will be retrieved from a remote node (the new partition owner); however, the result will be consistent in any case.

[1] https://issues.apache.org/jira/browse/IGNITE-2310


Denis

> On May 31, 2016, at 7:42 AM, Yakov Zhdanov <[hidden email]> wrote:
>
> Vova, even though it operates on a single key, we will need to "pin" the
> partition so that the key does not move to another node.
>
> Pavel, you can also send closures to all primary nodes to run a local scan
> query for each partition. This way you will go over each entry.
>
> Thanks!
> --
> Yakov Zhdanov, Director R&D
> *GridGain Systems*
> www.gridgain.com


Re: IgniteCache.invoke on ALL keys

Pavel Tupitsyn-3
Denis, thank you, this may work, but:
* it is too complicated
* it does not guarantee locality

This seems to be a common task, why don't we implement invokeAll as I
suggested above?

On Tue, May 31, 2016 at 1:05 PM, Denis Magda <[hidden email]> wrote:

> Pavel,
>
> Here is an example of how to run scan queries on partition owners and
> perform an operation on every entry:
>
> https://github.com/gridgain/gridgain-advanced-examples/blob/master/src/main/java/org/gridgain/examples/datagrid/query/ScanQueryExample.java
>
> Once this feature [1] is implemented, it will be possible to postpone
> partition movement while such an operation is in progress. At present, if
> rebalancing happens, the data will be retrieved from a remote node (the new
> partition owner); however, the result will be consistent in any case.
>
> [1] https://issues.apache.org/jira/browse/IGNITE-2310
>
> —
> Denis

Re: IgniteCache.invoke on ALL keys

Denis Magda
Pavel,

> This seems to be a common task, why don't we implement invokeAll as I
> suggested above?

I would implement such a method using the approach shown in the example. In my understanding it would be the most efficient way. Data locality will no longer be an issue when IGNITE-2310 is implemented.


Denis

> On May 31, 2016, at 1:16 PM, Pavel Tupitsyn <[hidden email]> wrote:
>
> Denis, thank you, this may work, but:
> * it is too complicated
> * it does not guarantee locality
>
> This seems to be a common task, why don't we implement invokeAll as I
> suggested above?


Re: IgniteCache.invoke on ALL keys

Pavel Tupitsyn-3
Ok then, looks like there are no obstacles or objections, I've created a
JIRA ticket:
https://issues.apache.org/jira/browse/IGNITE-3222

Thanks,
Pavel.

On Tue, May 31, 2016 at 3:44 PM, Denis Magda <[hidden email]> wrote:

> Pavel,
>
> > This seems to be a common task, why don't we implement invokeAll as I
> > suggested above?
>
> I would implement such a method using the approach shown in the example.
> In my understanding it would be the most efficient way. Data locality will
> no longer be an issue when IGNITE-2310 is implemented.
>
> —
> Denis

Re: IgniteCache.invoke on ALL keys

dsetrakyan
Pavel,

I actually believe that such a method will be error-prone and will cause all
sorts of memory issues for users trying to execute this method over large
caches.

What we need instead is an affinityCall/Run method over a partition, not a
key. Why not provide this method instead?

Added my comment to the ticket.

D.

On Tue, May 31, 2016 at 6:18 AM, Pavel Tupitsyn <[hidden email]>
wrote:

> Ok then, looks like there are no obstacles or objections, I've created a
> JIRA ticket:
> https://issues.apache.org/jira/browse/IGNITE-3222
>
> Thanks,
> Pavel.

Re: IgniteCache.invoke on ALL keys

Pavel Tupitsyn-3
Dmitriy, affinityCall/Run over a partition is useful, I agree.

But dealing with partitions manually is also error-prone and not obvious at
all.
Invoke over all cache entries is a simple common task, why not provide a
simple method of doing it?

What kind of memory issues do you expect?
Returning EntryProcessorResult for each entry of a huge cache is an obvious
problem, the same as calling getAll on QueryCursor, for example. We can't
protect users from such things.
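
The difference between the two kinds of overloads mentioned in this thread can be sketched in plain Java. Again this is a toy model, not Ignite code: a local `HashMap` stands in for the cache, and `InvokeAllSketch`, `invokeAll` and `invokeAllWithResults` are hypothetical names chosen for illustration only.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.BiFunction;

/** Toy model only: a local map stands in for the cache; all names are hypothetical. */
final class InvokeAllSketch {
    final Map<Integer, Integer> cache = new HashMap<>();

    /** Void variant: updates each entry in place and retains nothing per entry. */
    void invokeAll(BiFunction<Integer, Integer, Integer> processor) {
        cache.replaceAll(processor);
    }

    /** Result-returning variant: the returned map grows with the size of the cache. */
    Map<Integer, Integer> invokeAllWithResults(BiFunction<Integer, Integer, Integer> processor) {
        Map<Integer, Integer> results = new HashMap<>();
        cache.replaceAll((k, v) -> {
            Integer updated = processor.apply(k, v);
            results.put(k, updated); // one result object retained per entry
            return updated;
        });
        return results;
    }
}
```

The void variant uses no extra memory per entry, whereas the result-returning variant holds one result per cache entry on the caller side. That is the memory concern in miniature; whether it justifies leaving the method out is the design question under discussion here.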

Pavel.

On Wed, Jun 1, 2016 at 3:18 AM, Dmitriy Setrakyan <[hidden email]>
wrote:

> Pavel,
>
> I actually believe that such a method will be error-prone and will cause all
> sorts of memory issues for users trying to execute this method over large
> caches.
>
> What we need instead is an affinityCall/Run method over a partition, not a
> key. Why not provide this method instead?
>
> Added my comment to the ticket.
>
> D.
>
> On Tue, May 31, 2016 at 6:18 AM, Pavel Tupitsyn <[hidden email]>
> wrote:
>
> > Ok then, looks like there are no obstacles or objections, I've created a
> > JIRA ticket:
> > https://issues.apache.org/jira/browse/IGNITE-3222
> >
> > Thanks,
> > Pavel.
> >
> > On Tue, May 31, 2016 at 3:44 PM, Denis Magda <[hidden email]>
> wrote:
> >
> > > Pavel,
> > >
> > > > This seems to be a common task, why don't we implement invokeAll as I
> > > > suggested above?
> > >
> > > I would implement such a method using the approach shown in the
> example.
> > > In my understanding it would be the most efficient way. Data locality
> > will
> > > longer be not an issue when IGNITE-2310 is implemented.
> > >
> > > —
> > > Denis
> > >
> > > > On May 31, 2016, at 1:16 PM, Pavel Tupitsyn <[hidden email]>
> > > wrote:
> > > >
> > > > Denis, thank you, this may work, but:
> > > > * it is too complicated
> > > > * it does not guarantee locality
> > > >
> > > > This seems to be a common task, why don't we implement invokeAll as I
> > > > suggested above?
> > > >
> > > > On Tue, May 31, 2016 at 1:05 PM, Denis Magda <[hidden email]>
> > > wrote:
> > > >
> > > >> Pavel,
> > > >>
> > > > >> Here is an example of how to execute scan queries on partitions’ owners
> > > > >> and perform some operation for every entry:
> > > > >>
> > > > >> https://github.com/gridgain/gridgain-advanced-examples/blob/master/src/main/java/org/gridgain/examples/datagrid/query/ScanQueryExample.java
> > > > >>
> > > > >> When this feature is implemented [1] it will be possible to postpone
> > > > >> partition movement while such an operation is in progress. Presently, if
> > > > >> rebalancing happens, the data will be retrieved from a remote node (the new
> > > > >> partition owner); however, the result will be consistent in any case.
> > > > >>
> > > > >> [1] https://issues.apache.org/jira/browse/IGNITE-2310
> > > >>
> > > >> —
> > > >> Denis
> > > >>
> > > > >>> On May 31, 2016, at 7:42 AM, Yakov Zhdanov <[hidden email]> wrote:
> > > > >>>
> > > > >>> Vova, even though it operates on a single key, we will need to "pin" the
> > > > >>> partition so the key does not go to another node.
> > > > >>>
> > > > >>> Pavel, you can also send closures to all primary nodes to do a local scan
> > > > >>> query for each partition. This way you will go over each entry.
> > > >>>
> > > >>> Thanks!
> > > >>> --
> > > >>> Yakov Zhdanov, Director R&D
> > > >>> *GridGain Systems*
> > > >>> www.gridgain.com
> > > >>>
> > > >>> 2016-05-30 16:05 GMT-04:00 Vladimir Ozerov <[hidden email]>:
> > > >>>
> > > >>>> Affinity run/call operate on a single key AFAIK.
> > > >>>>
> > > > >>>> On Mon, May 30, 2016 at 10:55 PM, Dmitriy Setrakyan <[hidden email]>
> > > > >>>> wrote:
> > > > >>>>
> > > > >>>>> Actually I have seen a ticket to block moving partitions if affinityCall or
> > > > >>>>> affinityRun are called. I think once these tickets are implemented, the
> > > > >>>>> process will become reliable, no?
> > > >>>>>
> > > >>>>> D.
> > > >>>>>
> > > > >>>>> On Mon, May 30, 2016 at 9:13 AM, Pavel Tupitsyn <[hidden email]>
> > > > >>>>> wrote:
> > > >>>>>
> > > >>>>>> Dmitriy, as I understand, there is no reliable way to do that if
> > > >>>>>> rebalancing happens.
> > > >>>>>>
> > > > >>>>>> On Mon, May 30, 2016 at 6:50 PM, Dmitriy Setrakyan <[hidden email]>
> > > > >>>>>> wrote:
> > > > >>>>>>
> > > > >>>>>>> I think we do support this use case. Why not send a computation to a server
> > > > >>>>>>> and then perform the iteration through the cache entries locally on that
> > > > >>>>>>> server?
> > > >>>>>>>
> > > > >>>>>>> On Mon, May 30, 2016 at 4:44 AM, Pavel Tupitsyn <[hidden email]>
> > > > >>>>>>> wrote:
> > > > >>>>>>>
> > > > >>>>>>>> Igniters,
> > > > >>>>>>>>
> > > > >>>>>>>> Looks like we do not have an efficient way to perform an action on EVERY
> > > > >>>>>>>> cache entry.
> > > > >>>>>>>>
> > > > >>>>>>>> Let's say I want to remove all entries that match a predicate.
> > > > >>>>>>>> My only option is to retrieve these entries via Scan or SQL query, and then
> > > > >>>>>>>> call removeAll.
> > > > >>>>>>>> This involves a lot of unnecessary network trips (send keys to caller node,
> > > > >>>>>>>> send them back to primary nodes).
> > > > >>>>>>>>
> > > > >>>>>>>> Would it be possible to implement a method like
> > > > >>>>>>>> void IgniteCache.invokeAll(entryProcessor)
> > > > >>>>>>>> that invokes the processor on all entries and does not return anything?
> > > > >>>>>>>> There could be more overloads that return results or only return results
> > > > >>>>>>>> for changed entries.
> > > >>>>>>>>
> > > >>>>>>>> Thoughts?
> > > >>>>>>>>
> > > >>>>>>>> Pavel.
> > > >>>>>>>>
> > > >>>>>>>
> > > >>>>>>
> > > >>>>>
> > > >>>>
> > > >>
> > > >>
> > >
> > >
> >
>
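
For readers following the thread, the network-trip argument from the original post can be made concrete with a small sketch. This is a toy model in plain Java and does not use the Ignite API: the class `PartitionedCacheSketch`, its counters, and the modulo-based key placement are all assumptions made for illustration. It compares "scan, then removeAll from the caller" with "send one closure to each owner and remove locally".

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiPredicate;

/** Toy model of a partitioned cache (hypothetical, NOT the Ignite API). */
final class PartitionedCacheSketch {
    /** One map per simulated server node. */
    private final List<Map<Integer, String>> nodes = new ArrayList<>();

    /** Number of keys that crossed the simulated network. */
    long keysOverNetwork;

    /** Number of closures sent to server nodes. */
    long closuresSent;

    PartitionedCacheSketch(int nodeCount) {
        for (int i = 0; i < nodeCount; i++)
            nodes.add(new HashMap<>());
    }

    /** Assumed placement rule: the primary node is chosen by key modulo node count. */
    private Map<Integer, String> primary(int key) {
        return nodes.get(Math.floorMod(key, nodes.size()));
    }

    void put(int key, String val) {
        primary(key).put(key, val);
    }

    int size() {
        int total = 0;
        for (Map<Integer, String> node : nodes)
            total += node.size();
        return total;
    }

    /** Scan + removeAll from the caller: each matching key travels to the caller and back. */
    void removeViaCaller(BiPredicate<Integer, String> filter) {
        List<Integer> matched = new ArrayList<>();
        for (Map<Integer, String> node : nodes) {
            for (Map.Entry<Integer, String> e : node.entrySet()) {
                if (filter.test(e.getKey(), e.getValue())) {
                    matched.add(e.getKey());
                    keysOverNetwork++; // key sent from the owner to the caller
                }
            }
        }
        for (Integer key : matched) {
            primary(key).remove(key);
            keysOverNetwork++; // key sent from the caller back to the owner
        }
    }

    /** One closure per node: entries are filtered and removed where they live. */
    void removeOnOwners(BiPredicate<Integer, String> filter) {
        for (Map<Integer, String> node : nodes) {
            closuresSent++; // only the closure crosses the network, no keys
            node.entrySet().removeIf(e -> filter.test(e.getKey(), e.getValue()));
        }
    }

    public static void main(String[] args) {
        PartitionedCacheSketch viaCaller = new PartitionedCacheSketch(4);
        PartitionedCacheSketch onOwners = new PartitionedCacheSketch(4);
        for (int i = 0; i < 1000; i++) {
            viaCaller.put(i, "v" + i);
            onOwners.put(i, "v" + i);
        }
        viaCaller.removeViaCaller((k, v) -> k % 2 == 0);
        onOwners.removeOnOwners((k, v) -> k % 2 == 0);
        System.out.println("via caller: keys over network = " + viaCaller.keysOverNetwork);
        System.out.println("on owners:  keys over network = " + onOwners.keysOverNetwork
            + ", closures sent = " + onOwners.closuresSent);
    }
}
```

The model ignores rebalancing entirely, which is the part of the discussion that IGNITE-2310 is meant to address; it only shows why processing entries on their owners avoids shipping keys back and forth.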

Re: IgniteCache.invoke on ALL keys

dsetrakyan
I responded in the ticket:
https://issues.apache.org/jira/browse/IGNITE-3222

On Thu, Jun 2, 2016 at 2:10 AM, Pavel Tupitsyn <[hidden email]>
wrote:

> Dmitriy, affinityCall/Run over a partition is useful, I agree.
>
> But dealing with partitions manually is also error-prone and not obvious at
> all.
> Invoke over all cache entries is a simple common task, why not provide a
> simple method of doing it?
>
> What kind of memory issues do you expect?
> Returning EntryProcessorResult for each entry on a huge cache is an obvious
> problem,
> same as calling getAll on QueryCursor, for example. We can't protect users
> from such things.
>
> Pavel.