[ 
https://issues.apache.org/jira/browse/SPARK-50974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bobby Wang updated SPARK-50974:
-------------------------------
    Summary: CrossValidator: support foldCol (was: CrossValidator: foldCol is not supported)

> CrossValidator: support foldCol
> -------------------------------
>
>                 Key: SPARK-50974
>                 URL: https://issues.apache.org/jira/browse/SPARK-50974
>             Project: Spark
>          Issue Type: Sub-task
>          Components: Connect, ML, PySpark
>    Affects Versions: 4.0.0, 4.1
>            Reporter: Bobby Wang
>            Priority: Major
>
> Error message:
>     cvModel2 = cv_with_user_folds.fit(dataset_with_folds)
>                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>   File "/home/xxx/work.d/spark/spark-master/python/pyspark/ml/base.py", line 203, in fit
>     return self._fit(dataset)
>            ^^^^^^^^^^^^^^^^^^
>   File "/home/xxx/work.d/spark/spark-master/python/pyspark/ml/tuning.py", line 848, in _fit
>     datasets = self._kFold(dataset)
>                ^^^^^^^^^^^^^^^^^^^^
>   File "/home/xxx/work.d/spark/spark-master/python/pyspark/ml/tuning.py", line 906, in _kFold
>     training = dataset.filter(checker_udf(dataset[foldCol]) & (col(foldCol) != lit(i)))
>                               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>   File "/home/xxx/work.d/spark/spark-master/python/pyspark/sql/udf.py", line 405, in __call__
>     jcols = [_to_java_column(arg) for arg in args] + [
>             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>   File "/home/xxx/work.d/spark/spark-master/python/pyspark/sql/udf.py", line 405, in <listcomp>
>     jcols = [_to_java_column(arg) for arg in args] + [
>              ^^^^^^^^^^^^^^^^^^^^
>   File "/home/xxx/work.d/spark/spark-master/python/pyspark/sql/classic/column.py", line 71, in _to_java_column
>     raise PySparkTypeError(
> pyspark.errors.exceptions.base.PySparkTypeError: [NOT_COLUMN_OR_STR] Argument `col` should be a Column or str, got Column.
>  
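
For context, the failing line in `_kFold` builds the training split for fold `i` by keeping the rows whose `foldCol` value differs from `i`. The following is a minimal plain-Python sketch of that split, with no Spark involved. The helper name `k_fold_by_column`, the validation split (rows whose fold equals `i`) and the range check are assumptions made for illustration and are not the code in `tuning.py`.

```python
from typing import Dict, List, Tuple

Row = Dict[str, object]


def k_fold_by_column(
    rows: List[Row], fold_col: str, num_folds: int
) -> List[Tuple[List[Row], List[Row]]]:
    """Split rows into (training, validation) pairs using a user-supplied fold column.

    Hypothetical helper: it mirrors the filter shown in the traceback
    (training keeps rows where fold != i) on plain Python dicts.
    """
    # Assumed validation step: every fold index must lie in [0, num_folds).
    for row in rows:
        fold = row[fold_col]
        if not isinstance(fold, int) or fold < 0 or fold >= num_folds:
            raise ValueError(
                f"Fold number must be in range [0, {num_folds}), but got {fold}."
            )

    splits = []
    for i in range(num_folds):
        # Training set: every row that is NOT assigned to fold i.
        training = [row for row in rows if row[fold_col] != i]
        # Validation set: the rows assigned to fold i (assumed counterpart).
        validation = [row for row in rows if row[fold_col] == i]
        splits.append((training, validation))
    return splits


# Six rows spread over three user-defined folds.
example_rows = [{"id": n, "fold": n % 3} for n in range(6)]
example_splits = k_fold_by_column(example_rows, "fold", 3)
```

Each entry of `example_splits` pairs the rows outside fold `i` with the rows inside it, so every row is used for validation exactly once across the folds.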


