Hi Igniters!
I've been working on a prototype of distributed metrics computation for ML models. Unfortunately, we currently have no way to compute metrics in a distributed manner, so metric computation gathers the statistics on the client node via a ScanQuery, and the whole flow of vectors from the partitions is sent to the client. I want to avoid this behavior, so I propose a framework for metrics computation that uses a MapReduce approach based on aggregation of per-partition statistics.

I prepared an issue in Apache Jira for this: https://issues.apache.org/jira/browse/IGNITE-12155
I also prepared a PR for it: https://github.com/apache/ignite/pull/6857
Work on this framework is still in progress, but I'm going to prepare the full PR during this week.

With this email I want to start a discussion about this idea.

Best regards,
Alexey Platonov
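P.S. A rough sketch of what I mean by "aggregation of statistics" (the names below are just illustrative, not the actual API from the PR): a metric is defined by a small, partition-local statistics object plus an associative merge, so only these statistics objects travel to the client instead of the raw feature vectors.

/** Hypothetical per-partition statistics for binary-classification metrics. */
class BinaryClassificationStats implements java.io.Serializable {
    long tp, fp, fn, tn;

    /** 'Map' side: update the statistics with one (actual, predicted) pair from the local partition. */
    void accumulate(double actual, double predicted) {
        if (predicted == 1.0 && actual == 1.0) tp++;
        else if (predicted == 1.0 && actual == 0.0) fp++;
        else if (predicted == 0.0 && actual == 1.0) fn++;
        else tn++;
    }

    /** 'Reduce' side: merge statistics coming from two partitions. */
    static BinaryClassificationStats merge(BinaryClassificationStats a, BinaryClassificationStats b) {
        BinaryClassificationStats res = new BinaryClassificationStats();
        res.tp = a.tp + b.tp;
        res.fp = a.fp + b.fp;
        res.fn = a.fn + b.fn;
        res.tn = a.tn + b.tn;
        return res;
    }

    /** Final metric values computed on the client from the aggregated statistics. */
    double accuracy()  { return (double) (tp + tn) / (tp + tn + fp + fn); }
    double precision() { return (double) tp / (tp + fp); }
    double recall()    { return (double) tp / (tp + fn); }
}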
Hello, Alexey.
Why do we need distributed metrics in the first place? It seems there are many metric processing systems out there: Prometheus, Zabbix, Splunk, etc. Each of them can aggregate metrics in many ways.

I think we should not use Ignite as a metrics aggregation system. What do you think?
I mean metrics for model evaluation, like Accuracy or Precision/Recall for ML models. They aren't the same as system metrics (like throughput). Such metrics should be computed over a test set after model training. If you are interested, please have a look at this material: https://en.wikipedia.org/wiki/Precision_and_recall . The term "metrics" is simply a homonym shared by machine learning and system monitoring; we can't compute ML metrics via Zabbix, for example.

Best regards,
Alexey Platonov
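P.S. A quick worked example of the metrics I mean (these are the standard definitions, nothing specific to the PR): with TP true positives, FP false positives, and FN false negatives counted over the test set, precision = TP / (TP + FP) and recall = TP / (TP + FN). For example, TP = 40, FP = 10, FN = 20 gives precision = 40/50 = 0.8 and recall = 40/60 ≈ 0.67. The counts TP, FP, FN are exactly the kind of statistics that can be accumulated per partition and then summed up on the client.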
Hi, Alexey,
I agree that MapReduce on demand looks like a more promising solution. We can use compute tasks for the implementation. The 'Map' phase could be tuned to process data on some trigger (a dataset update?) in a ContinuousQuery manner, and the 'Reduce' phase (with some cache?) could be called on demand.

-- Best Regards, Vyacheslav D.
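P.S. A minimal sketch of the compute-task variant, assuming the test set lives in a plain Ignite cache (the cache name and entry class are hypothetical, and the ContinuousQuery-based trigger is left out for brevity): the 'Map' phase is a broadcast job that builds partial statistics from each node's primary entries, and the 'Reduce' phase sums them on the caller.

import java.io.Serializable;
import java.util.Collection;
import javax.cache.Cache;
import org.apache.ignite.Ignite;
import org.apache.ignite.IgniteCache;
import org.apache.ignite.Ignition;
import org.apache.ignite.cache.CachePeekMode;

public class DistributedAccuracySketch {
    /** Hypothetical test-set entry: the label predicted by the model vs. the actual one. */
    public static class LabeledPrediction implements Serializable {
        public double predicted;
        public double actual;
    }

    public static double accuracy(Ignite ignite, String testCacheName) {
        // 'Map': every node counts correct predictions over its primary partitions.
        Collection<long[]> partials = ignite.compute().broadcast(() -> {
            IgniteCache<Integer, LabeledPrediction> cache = Ignition.localIgnite().cache(testCacheName);

            long correct = 0, total = 0;
            for (Cache.Entry<Integer, LabeledPrediction> e : cache.localEntries(CachePeekMode.PRIMARY)) {
                if (e.getValue().predicted == e.getValue().actual)
                    correct++;
                total++;
            }

            return new long[] {correct, total};
        });

        // 'Reduce': aggregate the per-node statistics on the caller.
        long correct = 0, total = 0;
        for (long[] p : partials) {
            correct += p[0];
            total += p[1];
        }

        return total == 0 ? Double.NaN : (double) correct / total;
    }
}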
Hi, Vyacheslav,
Thanks for the advice. Actually, we already have a MapReduce implementation in the ML dataset, and it is based on compute tasks, so I think I can just reuse that solution.

Best regards,
Alexey Platonov
Dear Alexey, thank you for your PR. As the author of the non-distributed metrics, I should say that they were a quick solution to keep parity with Spark ML. I had no time to implement them via our internal MapReduce approach, and your PR is really helpful.

Dear Nikolay, this is another kind of metrics (not the kind you mentioned): metrics for evaluating machine learning algorithms, for example accuracy (how often the model predicted correctly) and so on.

Great PR, I will have a look tomorrow.