The following Databricks blog on Model Persistence states "Internally, we
save the model metadata and parameters as JSON and the data as Parquet."

https://databricks.com/blog/2016/05/31/apache-spark-2-0-preview-machine-learning-model-persistence.html


What data associated with a model or Pipeline is actually saved (in Parquet
format)?

What factors determine how large the saved model or Pipeline will be on disk?
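For context, here is a small sketch of how I've been measuring the on-disk footprint of a saved model. It assumes only that save() writes an ordinary directory tree (the blog suggests metadata as JSON plus data as Parquet); the helper itself is plain stdlib Python and the path is hypothetical:

```python
import os

def dir_size_bytes(path):
    """Total size in bytes of all files under path,
    e.g. a directory written by model.save(path)."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            total += os.path.getsize(os.path.join(root, name))
    return total

# Hypothetical usage against a saved pipeline directory:
# print(dir_size_bytes("/tmp/my_saved_pipeline"))
```

So the question is really: which parts of that tree (parameters, coefficients, tree structures, etc.) dominate the total, and what model properties drive their size?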

Thanks.
Rich
