Re: Closure method on a Cache Query


Re: Closure method on a Cache Query

yzhdanov
I am crossposting this to dev list as well.

So, is there a way to introduce a custom aggregate function? I assume not.

But even if we had such an ability, writing and plugging in such a custom
function seems much harder than writing a direct reducer/transformer in Java
code. However, this is not a very common use case in SQL, I think.

I think it would be very nice to support custom aggregate functions. As for
Java reducers, they are not very intuitive, and I would skip adding them
back if custom aggregates are possible.

Sergi, Alex, can you please share your opinions?

--Yakov

2015-06-09 22:21 GMT+03:00 Dmitriy Setrakyan <[hidden email]>:

> Also, if you need to return only a portion of the object (just a few
> fields), you can use SqlFieldsQuery and select only specific fields:
>
> select a, b, c from Person...
>
> D.
>
> On Tue, Jun 9, 2015 at 1:20 PM, fluffy <[hidden email]> wrote:
>
>> Using GridGain, I used to be able to associate a GridClosure method with a
>> GridCacheQuery. You could simply pass this Closure method into the
>> GridCacheQuery.execute() function and it would perform a function on each
>> cache element that matched the SQL query.
>>
>> This basically consolidated a number of tasks:
>>
>> - Instead of receiving entire objects from a query, a much smaller result
>> value was sent back to the node that initiated the query
>> - Allowed for the specification of some functionality within each Cache
>> Element rather than being a fairly dull data store
>> - Allowed for distributed processing of Cache Element information on the
>> node where each valid element existed, before results were aggregated/merged
>> on the calling node
>>
>> I do not see this functionality as having been ported to the Ignite
>> release.
>> At least not directly. Is there a way to do this in the current Ignite
>> version?
>>
>> I was looking at either a ScanQuery or using the AffinityRun methods, but
>> the latter doesn't seem to allow me to perform an SQL query first to limit
>> the Cache Elements....
>>
>>
>>
>> --
>> View this message in context:
>> http://apache-ignite-users.70518.x6.nabble.com/Closure-method-on-a-Cache-Query-tp456.html
>> Sent from the Apache Ignite Users mailing list archive at Nabble.com.
>>
>
>
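For readers following the quoted messages above, both approaches can be sketched roughly as follows. This is a non-authoritative sketch: the `Person` class, its field names, and the cache wiring are assumptions, and the `query(ScanQuery, IgniteClosure)` transformer overload may only be available in later Ignite versions.

```java
import java.util.List;
import javax.cache.Cache;
import org.apache.ignite.IgniteCache;
import org.apache.ignite.cache.query.ScanQuery;
import org.apache.ignite.cache.query.SqlFieldsQuery;
import org.apache.ignite.lang.IgniteBiPredicate;
import org.apache.ignite.lang.IgniteClosure;

class QuerySketches {
    // SqlFieldsQuery: ship back only the selected fields, not whole Person objects.
    static void printFields(IgniteCache<Long, Person> cache) {
        SqlFieldsQuery qry = new SqlFieldsQuery("select a, b, c from Person");
        for (List<?> row : cache.query(qry))
            System.out.println(row); // each row is a List of the selected field values
    }

    // ScanQuery plus a transformer closure: the transformer runs on the node that
    // owns each matching entry, so only the small transformed value travels back
    // to the calling node -- close in spirit to the old GridClosure-on-query style.
    static void printNames(IgniteCache<Long, Person> cache) {
        IgniteBiPredicate<Long, Person> filter = (k, p) -> p.getAge() > 30;
        IgniteClosure<Cache.Entry<Long, Person>, String> trans =
            e -> e.getValue().getName();
        for (String name : cache.query(new ScanQuery<>(filter), trans))
            System.out.println(name);
    }
}
```

Note that the scan filter and transformer both execute remotely, so they should be self-contained and serializable.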

Re: Closure method on a Cache Query

Atri Sharma
A problem I see with introducing custom aggregate functions is the
serialization/deserialization of transition states (assuming the aggregate
execution infrastructure looks like: transition function -> transval ->
final function -> output). If we allow custom aggregate functions, we may
also need to allow transition states to be "blackbox" to the system, since
we are not aware of the transition states that custom aggregate logic might
need for its operation. I see two solutions to this problem:

1) Allow only specific transition state types. (A bad idea: it restricts
the flexibility of the custom aggregates that can be written.)
2) Make an umbrella type for all blackbox transition states. We can require
that serialization and deserialization functions for that type are provided
by the user.

Thoughts?
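To make the model concrete, here is a minimal plain-Java sketch of option 2. All names (`Aggregate`, `Avg`, `State`) are hypothetical, not an existing Ignite API: the transition state is an opaque Serializable object that only the user's own transition/merge/final functions understand, so the engine never needs to know its shape.

```java
import java.io.Serializable;

// Hypothetical shape of the pipeline described above:
// transition function -> transval (state) -> final function -> output.
interface Aggregate<I, S extends Serializable, O> {
    S init();                        // initial transition state ("transval")
    S transition(S state, I input);  // transition function, applied per input row
    S merge(S a, S b);               // combine partial states from different nodes
    O finish(S state);               // final function
}

// Example user aggregate: AVG, with a "blackbox" Serializable state.
// The engine only ever serializes State; it never inspects its fields.
class Avg implements Aggregate<Double, Avg.State, Double> {
    static class State implements Serializable {
        double sum;
        long count;
    }
    public State init() { return new State(); }
    public State transition(State s, Double x) { s.sum += x; s.count++; return s; }
    public State merge(State a, State b) { a.sum += b.sum; a.count += b.count; return a; }
    public Double finish(State s) { return s.count == 0 ? 0.0 : s.sum / s.count; }
}
```

The `merge` step is what makes this work in a distributed query: each data node folds its local rows into a partial state, and only those partial states need to cross the wire.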




--
Regards,

Atri
*l'apprenant*

Re: Closure method on a Cache Query

Atri Sharma
Unless, of course, I am missing something.





Re: Closure method on a Cache Query

dsetrakyan
I think custom aggregation functions are not supported right now and will
require a special API. When a user creates an implementation of such an API,
we will require that it be Serializable, so that we can transfer any state
the user may add.

D.
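As a sketch of why Serializable suffices here: any state a user puts into such an implementation can be round-tripped through standard Java serialization, which is all the engine would need in order to move partial aggregation state between nodes. The `SumState` and `roundTrip` names below are illustrative only, not a proposed API.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Illustrative-only state for a custom aggregate; Serializable so the engine
// could ship partial results between nodes without knowing their shape.
class SumState implements Serializable {
    private static final long serialVersionUID = 1L;
    long sum;
    SumState add(long x) { sum += x; return this; }
}

class TransferDemo {
    // Round-trip a state through Java serialization, as an engine would when
    // moving a partial aggregate from a data node back to the reducing node.
    static SumState roundTrip(SumState s) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(s);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (SumState) in.readObject();
            }
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }
}
```

In practice Ignite's own marshaller would handle the wire format, but the user-facing contract stays the same: implement Serializable and the state travels intact.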


Re: Closure method on a Cache Query

Atri Sharma
The Serializable interface, I think, satisfies the serialization part of
what I mentioned.

I think we can go with the standard definition model (transfn + transval +
finalfn) for the custom aggregates.

I can help with the design and/or implementation if needed.



