Hi Igniters!
I've been working on a prototype of distributed metrics computation for ML models. Unfortunately, we currently have no way to compute metrics in a distributed manner, so metric computation gathers the statistics on the client node via a ScanQuery, and the whole flow of vectors from the partitions is sent to the client. I want to avoid this behavior, so I propose a framework for metrics computation that uses a MapReduce approach based on aggregation of per-partition statistics.

I prepared an issue in Apache Jira for this: https://issues.apache.org/jira/browse/IGNITE-12155
I also prepared a PR for it: https://github.com/apache/ignite/pull/6857
Work on this framework is still in progress, but I'm going to prepare the full PR during this week.

With this email I want to start a discussion about this idea.

Best regards,
Alexey Platonov
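P.S. A rough sketch of what I mean by "aggregation of statistics" (the names below are just illustrative, not the actual API from the PR): a metric is defined by a small, partition-local statistics object plus an associative merge, so only these statistics objects travel to the client instead of the raw feature vectors.

/** Hypothetical per-partition statistics for binary-classification metrics. */
class BinaryClassificationStats implements java.io.Serializable {
    long tp, fp, fn, tn;

    /** 'Map' side: update the statistics with one (actual, predicted) pair from the local partition. */
    void accumulate(double actual, double predicted) {
        if (predicted == 1.0 && actual == 1.0) tp++;
        else if (predicted == 1.0 && actual == 0.0) fp++;
        else if (predicted == 0.0 && actual == 1.0) fn++;
        else tn++;
    }

    /** 'Reduce' side: merge statistics coming from two partitions. */
    static BinaryClassificationStats merge(BinaryClassificationStats a, BinaryClassificationStats b) {
        BinaryClassificationStats res = new BinaryClassificationStats();
        res.tp = a.tp + b.tp;
        res.fp = a.fp + b.fp;
        res.fn = a.fn + b.fn;
        res.tn = a.tn + b.tn;
        return res;
    }

    /** Final metric values computed on the client from the aggregated statistics. */
    double accuracy()  { return (double) (tp + tn) / (tp + tn + fp + fn); }
    double precision() { return (double) tp / (tp + fp); }
    double recall()    { return (double) tp / (tp + fn); }
}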
Hello, Alexey.
Why do we need distributed metrics in the first place? It seems there are many metric processing systems out there: Prometheus, Zabbix, Splunk, etc. Each of them can aggregate metrics in many ways.

I think we should not use Ignite as a metrics aggregation system. What do you think?
I mean metrics for model evaluation, like Accuracy or Precision/Recall for ML models. They aren't the same as system metrics (like throughput). Such metrics should be computed over a test set after model training. If you are interested, please have a look at this material: https://en.wikipedia.org/wiki/Precision_and_recall . The term "metrics" is simply a homonym shared by machine learning and system monitoring; we can't compute ML metrics via Zabbix, for example.

Best regards,
Alexey Platonov
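P.S. A quick worked example of the metrics I mean (these are the standard definitions, nothing specific to the PR): with TP true positives, FP false positives, and FN false negatives counted over the test set, precision = TP / (TP + FP) and recall = TP / (TP + FN). For example, TP = 40, FP = 10, FN = 20 gives precision = 40/50 = 0.8 and recall = 40/60 ≈ 0.67. The counts TP, FP, FN are exactly the kind of statistics that can be accumulated per partition and then summed up on the client.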
Hi, Alexey,
I agree that MapReduce on demand looks like a more promising solution. We can use compute tasks for the implementation. The 'Map' phase could be tuned to process data on some trigger (a dataset update?) in a ContinuousQuery manner, and the 'Reduce' phase (with some cache?) could be called on demand.

-- Best Regards, Vyacheslav D.
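P.S. A minimal sketch of the compute-task variant, assuming the test set lives in a plain Ignite cache (the cache name and entry class are hypothetical, and the ContinuousQuery-based trigger is left out for brevity): the 'Map' phase is a broadcast job that builds partial statistics from each node's primary entries, and the 'Reduce' phase sums them on the caller.

import java.io.Serializable;
import java.util.Collection;
import javax.cache.Cache;
import org.apache.ignite.Ignite;
import org.apache.ignite.IgniteCache;
import org.apache.ignite.Ignition;
import org.apache.ignite.cache.CachePeekMode;

public class DistributedAccuracySketch {
    /** Hypothetical test-set entry: the label predicted by the model vs. the actual one. */
    public static class LabeledPrediction implements Serializable {
        public double predicted;
        public double actual;
    }

    public static double accuracy(Ignite ignite, String testCacheName) {
        // 'Map': every node counts correct predictions over its primary partitions.
        Collection<long[]> partials = ignite.compute().broadcast(() -> {
            IgniteCache<Integer, LabeledPrediction> cache = Ignition.localIgnite().cache(testCacheName);

            long correct = 0, total = 0;
            for (Cache.Entry<Integer, LabeledPrediction> e : cache.localEntries(CachePeekMode.PRIMARY)) {
                if (e.getValue().predicted == e.getValue().actual)
                    correct++;
                total++;
            }

            return new long[] {correct, total};
        });

        // 'Reduce': aggregate the per-node statistics on the caller.
        long correct = 0, total = 0;
        for (long[] p : partials) {
            correct += p[0];
            total += p[1];
        }

        return total == 0 ? Double.NaN : (double) correct / total;
    }
}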
Hi, Vyacheslav,
Thanks for the advice. Actually, we already have a MapReduce implementation in the ML dataset, and it is based on compute tasks, so I think I can just reuse that solution.

Best regards,
Alexey Platonov
Dear Alexey, thank you for your PR. As the author of the non-distributed metrics, I should say that they were a quick solution to keep parity with Spark ML. I had no time to implement them via our internal MapReduce approach, and your PR is really helpful.

Dear Nikolay, this is another kind of metrics (not the kind you mentioned): metrics for evaluating machine learning algorithms, for example accuracy (how often the model predicted correctly) and so on.

Great PR, I will have a look tomorrow.