Bobby Wang created SPARK-50974:
----------------------------------

             Summary: CrossValidator: foldCol is not supported
                 Key: SPARK-50974
                 URL: https://issues.apache.org/jira/browse/SPARK-50974
             Project: Spark
          Issue Type: Sub-task
          Components: Connect, ML, PySpark
    Affects Versions: 4.0.0, 4.1
            Reporter: Bobby Wang


Error message:

    cvModel2 = cv_with_user_folds.fit(dataset_with_folds)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/xxx/work.d/spark/spark-master/python/pyspark/ml/base.py", line 203, in fit
    return self._fit(dataset)
           ^^^^^^^^^^^^^^^^^^
  File "/home/xxx/work.d/spark/spark-master/python/pyspark/ml/tuning.py", line 848, in _fit
    datasets = self._kFold(dataset)
               ^^^^^^^^^^^^^^^^^^^^
  File "/home/xxx/work.d/spark/spark-master/python/pyspark/ml/tuning.py", line 906, in _kFold
    training = dataset.filter(checker_udf(dataset[foldCol]) & (col(foldCol) != lit(i)))
                              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/xxx/work.d/spark/spark-master/python/pyspark/sql/udf.py", line 405, in __call__
    jcols = [_to_java_column(arg) for arg in args] + [
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/xxx/work.d/spark/spark-master/python/pyspark/sql/udf.py", line 405, in <listcomp>
    jcols = [_to_java_column(arg) for arg in args] + [
             ^^^^^^^^^^^^^^^^^^^^
  File "/home/xxx/work.d/spark/spark-master/python/pyspark/sql/classic/column.py", line 71, in _to_java_column
    raise PySparkTypeError(
pyspark.errors.exceptions.base.PySparkTypeError: [NOT_COLUMN_OR_STR] Argument `col` should be a Column or str, got Column.
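
For context, the failing line in _kFold builds the training split for fold i by keeping the rows whose fold index differs from i. Below is a minimal plain-Python sketch of that split, with no Spark involved. The function name k_fold_by_column and the dict-based rows are hypothetical, and the range check only stands in for checker_udf, whose exact behaviour the traceback does not show. Using the rows with fold index i as the validation split is likewise an assumption, based on how k-fold validation usually works.

```python
from typing import Dict, List, Tuple

Row = Dict[str, object]


def k_fold_by_column(
    rows: List[Row], fold_col: str, num_folds: int
) -> List[Tuple[List[Row], List[Row]]]:
    """Split rows into (training, validation) pairs using a user-supplied fold column."""
    # Assumed validation step: every fold index must be an int in [0, num_folds).
    for row in rows:
        fold = row[fold_col]
        if isinstance(fold, bool) or not isinstance(fold, int):
            raise ValueError(f"fold index must be an integer, got {fold!r}")
        if not 0 <= fold < num_folds:
            raise ValueError(f"fold index {fold} is outside [0, {num_folds})")

    splits = []
    for i in range(num_folds):
        # Mirrors the filter in the traceback: training keeps rows with fold != i.
        training = [row for row in rows if row[fold_col] != i]
        # Assumed counterpart: validation keeps rows with fold == i.
        validation = [row for row in rows if row[fold_col] == i]
        splits.append((training, validation))
    return splits


if __name__ == "__main__":
    data = [{"x": n, "fold": n % 3} for n in range(6)]
    for i, (training, validation) in enumerate(k_fold_by_column(data, "fold", 3)):
        print(i, len(training), len(validation))
```

The sketch only illustrates the intended semantics of foldCol. It does not reproduce the reported error, which is raised when the UDF call converts its Column argument.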

 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
