> Vertica has it. Good idea to introduce it in Druid.

I'm not sure this is a valid argument; with it, you could justify introducing anything into Druid. I think it is good to be opinionated and to decide, as a community, why we do or don't introduce ML capabilities into the software.
For example, databases like Postgres and BigQuery allow users to build simple regression models: https://cloud.google.com/bigquery-ml/docs/bigqueryml-intro. I also don't think it would be that hard to introduce linear regression using gradient descent into Druid: https://spin.atomicobject.com/2014/06/24/gradient-descent-linear-regression/ However, how many people are going to use this?

For me, it makes more sense to have tooling around Druid to slice and dice the data that you need, and to do the ML work in sklearn, or even in Spark; for example, using https://github.com/druid-io/pydruid, or having the ability to use Spark to read directly from the deep storage.

Introducing models via stored procedures or UDFs is also a possibility, but here I share Sayat's concerns about performance and scalability.

Cheers, Fokko

On Sat, Jan 25, 2020 at 08:51 Gaurav Bhatnagar <gaura...@gmail.com> wrote:

> +1
>
> Vertica has it. Good idea to introduce it in Druid.
>
> On Mon, Jan 13, 2020 at 12:52 AM Dusan Maric <thema...@gmail.com> wrote:
>
> > +1
> >
> > That would be a great idea! Thanks for sharing this.
> >
> > Would just like to chime in on Druid + ML model cases: predictions and
> > anomaly detection on top of TensorFlow ❤
> >
> > Regards,
> >
> > On Fri, Jan 10, 2020 at 6:41 AM Roman Leventov <leventov...@gmail.com>
> > wrote:
> >
> > > Hello Druid developers, what do you think about the future of Druid &
> > > machine learning?
> > >
> > > Druid has been great at complex aggregations. Could (should?) it make
> > > inroads into ML? Perhaps aggregators which apply the rows against some
> > > pre-trained model and summarize results.
> > >
> > > Should model training stay completely external to Druid, or could it be
> > > incorporated into Druid's data lifecycle on a conceptual level, such as a
> > > recurring "indexing" task which stores the result (the model) in Druid's
> > > deep storage, the model automatically loaded on historical nodes as
> > > needed (just like segments), and certain aggregators pick up the latest
> > > model?
> > >
> > > Does this make any sense? In what cases will Druid & ML work well
> > > together, in what cases won't they, and should ML stay Spark's
> > > prerogative?
> > >
> > > I would be very interested to hear any thoughts on the topic, vague
> > > ideas and questions.
> >
> > --
> > Dušan Marić
> > mob.: +381 64 1124779 | e-mail: thema...@gmail.com | skype: themaric
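To illustrate Fokko's point about how small the gradient-descent linear regression from the linked blog post actually is, here is a minimal sketch in plain Python. The function name `fit_linear` and the toy data are made up for illustration; this is not a proposal for how a Druid aggregator would implement it.

```python
def fit_linear(xs, ys, lr=0.01, epochs=2000):
    """Fit y = m*x + b by batch gradient descent on mean squared error."""
    m, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Partial derivatives of MSE with respect to slope m and intercept b.
        grad_m = (-2.0 / n) * sum(x * (y - (m * x + b)) for x, y in zip(xs, ys))
        grad_b = (-2.0 / n) * sum(y - (m * x + b) for x, y in zip(xs, ys))
        # Step both parameters downhill.
        m -= lr * grad_m
        b -= lr * grad_b
    return m, b

# Noise-free points on the line y = 2x + 1; the fit should converge near it.
m, b = fit_linear([0.0, 1.0, 2.0, 3.0, 4.0], [1.0, 3.0, 5.0, 7.0, 9.0])
print(round(m, 2), round(b, 2))
```

The training loop is trivial; the hard part, as the thread discusses, would be wiring training and model storage into Druid's indexing and segment lifecycle rather than leaving it to pydruid + sklearn or Spark.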