In Spark 1.4, we added rand as a DataFrame expression, which can be used
for a random split. Please see the example here:
https://github.com/apache/spark/blob/master/python/pyspark/ml/tuning.py#L214
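
As a rough illustration of the idea (a minimal sketch, not the code at the
link above): attach a uniform random column with rand, then filter on value
ranges to get train/validation pairs. The helper names fold_bounds and
k_fold_split, the "_rand" column name, and the default seed are all
hypothetical choices made for this example.

```python
def fold_bounds(n_folds):
    """Return [(lower, upper), ...] splitting [0, 1) into n_folds equal ranges."""
    h = 1.0 / n_folds
    return [(i * h, (i + 1) * h) for i in range(n_folds)]


def k_fold_split(df, n_folds, seed=42, rand_col="_rand"):
    """Yield (train, validation) DataFrame pairs, one pair per fold.

    Hypothetical helper: `df` is assumed to be an existing Spark DataFrame
    that has no column named `rand_col`.
    """
    # Imported here so fold_bounds stays usable without a Spark installation.
    from pyspark.sql.functions import rand

    # Attach one uniform random value in [0, 1) to every row.
    with_rand = df.select("*", rand(seed).alias(rand_col))

    for lower, upper in fold_bounds(n_folds):
        # Rows whose random value falls in this range form the validation set.
        in_fold = (with_rand[rand_col] >= lower) & (with_rand[rand_col] < upper)
        validation = with_rand.filter(in_fold).drop(rand_col)
        train = with_rand.filter(~in_fold).drop(rand_col)
        yield train, validation
```

For a single train/validation split, taking the first pair from
k_fold_split(df, 5) would give roughly 80% of rows for training and 20% for
validation; the split sizes are approximate because each row is assigned
independently at random.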
-Xiangrui

On Thu, May 7, 2015 at 8:39 AM, Olivier Girardot
<[email protected]> wrote:
> Hi,
> is there a best practice for doing a randomSplit (as in MLlib) of a
> training/cross-validation set with DataFrames and the Pipeline API?
>
> Regards
>
> Olivier.

