Hi Georg,

It's true we need better documentation for this.  I'd recommend checking
out simple algorithms within Spark for examples:
ml.feature.Tokenizer
ml.regression.IsotonicRegression

You should not need to put your library in Spark's namespace.  The shared
Params in SPARK-7146 are not necessary to create a custom algorithm; they
are just niceties.

Though there aren't great docs yet, you should be able to follow existing
examples.  And I'd like to add more docs in the future!
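To make that concrete, here is a minimal sketch (mine, not from the thread) of a custom Transformer living in its own package, assuming the Spark 2.x ml API. The class and package names are invented for illustration; it hand-rolls its params, which is exactly the boilerplate the shared params in SPARK-7146 would save:

```scala
// Hypothetical package outside org.apache.spark -- nothing below requires
// residing in Spark's namespace.
package com.example.ml

import org.apache.spark.ml.Transformer
import org.apache.spark.ml.param.{Param, ParamMap}
import org.apache.spark.ml.util.Identifiable
import org.apache.spark.sql.{DataFrame, Dataset}
import org.apache.spark.sql.functions.{col, upper}
import org.apache.spark.sql.types.{StringType, StructField, StructType}

class UpperCaser(override val uid: String) extends Transformer {
  def this() = this(Identifiable.randomUID("upperCaser"))

  // Hand-rolled params; the shared params of SPARK-7146 would only
  // replace this boilerplate, they are not required.
  final val inputCol  = new Param[String](this, "inputCol", "input column name")
  final val outputCol = new Param[String](this, "outputCol", "output column name")
  def setInputCol(value: String): this.type  = set(inputCol, value)
  def setOutputCol(value: String): this.type = set(outputCol, value)

  // Append an upper-cased copy of the input column.
  override def transform(dataset: Dataset[_]): DataFrame =
    dataset.withColumn($(outputCol), upper(col($(inputCol))))

  // Declare the output column so downstream stages can validate schemas.
  override def transformSchema(schema: StructType): StructType =
    schema.add(StructField($(outputCol), StringType, nullable = true))

  override def copy(extra: ParamMap): UpperCaser = defaultCopy(extra)
}
```

Used like any built-in stage: `new UpperCaser().setInputCol("text").setOutputCol("textUpper")` can be dropped into a Pipeline. An Estimator follows the same pattern, except fit() returns a Model.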

Good luck,
Joseph

On Wed, Nov 16, 2016 at 6:29 AM, Georg Heiler <georg.kf.hei...@gmail.com>
wrote:

> Hi,

>
> I want to develop a library with custom Estimator / Transformers for
> Spark. So far I have not found much documentation, but
> http://stackoverflow.com/questions/37270446/how-to-roll-a-custom-estimator-in-pyspark-mllib
> suggests that:
> Generally speaking, there is no documentation because as for Spark 1.6 /
> 2.0 most of the related API is not intended to be public. It should change
> in Spark 2.1.0 (see SPARK-7146
> <https://issues.apache.org/jira/browse/SPARK-7146>).
>
> Where can I already find documentation today?
> Is it true that my library would need to reside in Spark's namespace,
> similar to https://github.com/collectivemedia/spark-ext to utilize all
> the handy functionality?
>
> Kind Regards,
> Georg
>