My $0.02 -- this isn't worthwhile. Yes, there are ML-in-SQL tools; I'm thinking of MADlib, for example. I think these are holdovers from the days when someone's only interface to a data warehouse was SQL, and so there had to be SQL-language support for invoking ML jobs. There was no programmatic alternative.
There's nothing particularly helpful about SQL as a language for expressing this, versus simply writing operations in a high-level programming language. Spark is that programmatic paradigm, and offers a more general way to express ETL, ML and SQL within their own appropriate DSLs. There's no need to also shoehorn Spark ML into Spark SQL.

I also think there's a bit of false abstraction here. The nice thing about SQL-only access to these functions is it sounds much simpler, and accessible to people that only know SQL and nothing about Python or JVMs. In practice, using Spark means having some basic awareness of its distributed execution environment. SQL-only analysts would struggle to be effective with SQL-only access to Spark.

On Fri, Aug 31, 2018 at 5:05 AM Hemant Bhanawat <hemant9...@gmail.com> wrote:

> We allow our users to interact with spark cluster using SQL queries only.
> That's easy for them. MLLib does not have SQL extensions and we cannot
> expose it to our users.
>
> SQL extensions can further accelerate MLLib's adoption. See
> https://cloud.google.com/bigquery/docs/bigqueryml-intro.
>
> Hemant
>
>
> On Thu, Aug 30, 2018 at 9:41 PM William Benton <wi...@redhat.com> wrote:
>
>> What are you interested in accomplishing?
>>
>> The spark.ml package has provided a machine learning API based on
>> DataFrames for quite some time. If you are interested in mixing query
>> processing and machine learning, this is certainly the best place to start.
>>
>> See here: https://spark.apache.org/docs/latest/ml-guide.html
>>
>>
>> best,
>> wb
>>
>>
>> On Thu, Aug 30, 2018 at 1:45 AM Hemant Bhanawat <hemant9...@gmail.com>
>> wrote:
>>
>>> Is there a plan to support SQL extensions for mllib? Or is there an
>>> effort already underway?
>>>
>>> Any information is appreciated.
>>>
>>> Thanks in advance.
>>> Hemant
>>>
>>