Hi Yu,

Reducing the code complexity on the Python side is certainly what we want to see :) We didn't call Java directly in the Python models because Java methods don't work inside RDD closures, e.g.,
    rdd.map(lambda x: model.predict(x[1]))

But I agree that for model save/load the implementation should be simplified. Could you submit a PR and see how much code we can save?

Thanks,
Xiangrui

On Wed, Jun 17, 2015 at 8:15 PM, Yu Ishikawa <yuu.ishikawa+sp...@gmail.com> wrote:
> Hi all,
>
> I think we should refactor some machine learning model classes in Python to
> improve the software maintainability.
> By inheriting from the JavaModelWrapper class, we can easily and directly call
> the Scala API for the model without going through PythonMLlibAPI.
>
> In some cases, a machine learning model class in Python has complicated
> variables. That makes it a little hard to implement import/export methods,
> and it is also a little troublesome to implement the same function in both
> Scala and Python. I also think standardizing how to create a model class in
> Python is important.
>
> What do you think about that?
>
> Thanks,
> Yu
>
> -----
> -- Yu Ishikawa
> --
> View this message in context:
> http://apache-spark-developers-list.1001551.n3.nabble.com/mllib-Refactoring-some-spark-mllib-model-classes-in-Python-not-inheriting-JavaModelWrapper-tp12781.html
> Sent from the Apache Spark Developers List mailing list archive at Nabble.com.
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org
> For additional commands, e-mail: dev-h...@spark.apache.org