I don't think we've given much thought to model persistence for custom Python models yet. If the Python model is wrapping a JVM model, using JavaMLWritable along with '_to_java' should work, provided your Java model is already saveable. On the other hand, if your model isn't wrapping a Java model, you shouldn't feel the need to shoehorn yourself into this approach. In either case much of the persistence work is up to you; it's just a matter of whether you do it in the JVM or in Python.
On Friday, August 19, 2016, Nicholas Chammas <nicholas.cham...@gmail.com> wrote:

> I understand persistence for PySpark ML pipelines is already present in
> 2.0, and further improvements are being made for 2.1 (e.g. SPARK-13786
> <https://issues.apache.org/jira/browse/SPARK-13786>).
>
> I'm having trouble, though, persisting a pipeline that includes a custom
> Transformer (see SPARK-17025
> <https://issues.apache.org/jira/browse/SPARK-17025>). It appears that
> there is a magic _to_java() method that I need to implement.
>
> Is the intention that developers implementing custom Transformers would
> also specify how it should be persisted, or are there ideas about how to
> make this automatic? I searched on JIRA but I'm not sure if I missed an
> issue that already addresses this problem.
>
> Nick

--
Cell : 425-233-8271
Twitter: https://twitter.com/holdenkarau