Dear Alexey, thank you for your PR. As the author of the non-distributed metrics,
I should say that it was a quick solution to keep parity with Spark ML.
I haven't had time to implement it via our internal MapReduce approach, and your
PR is really helpful.
Dear Nikolay, there is another kind of metrics (not the kind mentioned before).
Hi, Vyacheslav,
Thanks for the advice. Actually, we already have a MapReduce-style
implementation in the ML dataset, and it is based on a compute task.
So I think I can just reuse that solution.
Best regards,
Alexey Platonov
Tue, 10 Sep 2019, 14:27, Vyacheslav Daradur:
Hi, Alexey,
I agree that Map-Reduce on demand looks like the more promising solution.
We can use Compute tasks for the implementation.
The 'Map' phase could be tuned to process data on some trigger (a dataset
update?) in a ContinuousQuery manner, and 'Reduce' (with some
cache?) could be called on demand.
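To make the idea a bit more concrete, here is a very rough sketch of the 'Map'
phase running as a broadcast compute job over locally owned entries, with
'Reduce' done on the caller. The cache name, the TestRow class and the
predict() stub are just placeholders, and the ContinuousQuery-triggered
incremental update is left out:

import java.util.Collection;
import javax.cache.Cache;
import org.apache.ignite.Ignite;
import org.apache.ignite.IgniteCache;
import org.apache.ignite.cache.CachePeekMode;
import org.apache.ignite.lang.IgniteCallable;
import org.apache.ignite.resources.IgniteInstanceResource;

/** Hypothetical test-set row: feature vector plus the expected label. */
class TestRow implements java.io.Serializable {
    final double[] features;
    final double label;

    TestRow(double[] features, double label) {
        this.features = features;
        this.label = label;
    }
}

/** 'Map' job: counts correct predictions over the entries owned by the local node. */
class LocalAccuracyJob implements IgniteCallable<long[]> {
    @IgniteInstanceResource
    private transient Ignite ignite;

    @Override public long[] call() {
        IgniteCache<Long, TestRow> cache = ignite.cache("testCache"); // hypothetical cache name
        long correct = 0, total = 0;

        // Only locally owned primary copies are scanned, nothing crosses the network here.
        for (Cache.Entry<Long, TestRow> e : cache.localEntries(CachePeekMode.PRIMARY)) {
            if (predict(e.getValue().features) == e.getValue().label)
                correct++;
            total++;
        }

        return new long[] {correct, total};
    }

    /** Placeholder for applying the trained model to one feature vector. */
    private double predict(double[] features) {
        return features[0] > 0.5 ? 1.0 : 0.0; // dummy model, illustration only
    }
}

class DistributedAccuracy {
    /** 'Reduce' on demand: only two longs per node travel back to the caller. */
    static double accuracy(Ignite ignite) {
        Collection<long[]> partials = ignite.compute().broadcast(new LocalAccuracyJob());

        long correct = 0, total = 0;
        for (long[] p : partials) {
            correct += p[0];
            total += p[1];
        }

        return total == 0 ? 0 : (double) correct / total;
    }
}

The ContinuousQuery idea would then only need to keep the per-node counters up
to date instead of rescanning the cache on every request.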
On Tue, Sep 10, 2019 at
I mean metrics for model evaluation, like Accuracy or Precision/Recall for
ML models. They aren't the same as system metrics (like throughput). Such metrics
should be computed over a test set after model training. If it is
interesting for you, please have a look at this material:
https://en.wikipedia.org/
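To make it concrete, here is a tiny self-contained example of how such metrics
are derived from the confusion-matrix counts of a binary classifier (the counts
below are made-up numbers, just for illustration):

/** Minimal illustration: classification metrics from binary confusion-matrix counts. */
class ClassificationMetricsExample {
    public static void main(String[] args) {
        // Made-up counts for a binary classifier evaluated on a held-out test set.
        long tp = 40; // true positives
        long fp = 10; // false positives
        long tn = 45; // true negatives
        long fn = 5;  // false negatives

        double accuracy  = (double) (tp + tn) / (tp + tn + fp + fn); // share of correct predictions
        double precision = (double) tp / (tp + fp);                  // predicted positives that are real
        double recall    = (double) tp / (tp + fn);                  // real positives that were found
        double f1        = 2 * precision * recall / (precision + recall);

        System.out.printf("accuracy=%.2f precision=%.2f recall=%.2f f1=%.2f%n",
            accuracy, precision, recall, f1);
    }
}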
Hello, Alexey.
Why do we need distributed metrics in the first place?
It seems there are many metric processing systems out there: Prometheus,
Zabbix, Splunk, etc.
Each of them can aggregate metrics in many ways.
I think we should not use Ignite as a metrics aggregation system.
What do you think?
Hi Igniters!
I've been working on a prototype of distributed metrics computation for
ML models. Unfortunately, we don't have the ability to compute metrics in a
distributed manner, so metric statistics are currently gathered on the client
node via a ScanQuery, and the whole flow of vectors from the partitions will be
pulled to the client.
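To illustrate the problem, the current evaluation looks roughly like the sketch
below: a ScanQuery cursor streams every test vector from every partition to the
client before the metric can be computed (the cache name, the row layout and the
predict() stub are placeholders):

import javax.cache.Cache;
import org.apache.ignite.Ignite;
import org.apache.ignite.IgniteCache;
import org.apache.ignite.cache.query.QueryCursor;
import org.apache.ignite.cache.query.ScanQuery;

/** Sketch of the current approach: all test vectors are pulled to the client node. */
class ClientSideAccuracy {
    static double accuracy(Ignite ignite) {
        IgniteCache<Long, double[]> cache = ignite.cache("testCache"); // hypothetical cache of test rows

        long correct = 0, total = 0;

        // The cursor transfers every entry from every partition to this node.
        try (QueryCursor<Cache.Entry<Long, double[]>> cursor = cache.query(new ScanQuery<Long, double[]>())) {
            for (Cache.Entry<Long, double[]> e : cursor) {
                double[] row = e.getValue();
                double label = row[row.length - 1]; // assume the label is stored in the last element
                if (predict(row) == label)
                    correct++;
                total++;
            }
        }

        return total == 0 ? 0 : (double) correct / total;
    }

    /** Placeholder for applying the trained model. */
    private static double predict(double[] row) {
        return row[0] > 0.5 ? 1.0 : 0.0; // dummy model, illustration only
    }
}

A distributed (MapReduce-style) computation would avoid exactly this traffic:
only small partial aggregates per node would have to cross the network.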