Bobby Wang created SPARK-50974:
----------------------------------

             Summary: CrossValidator: foldCol is not supported
                 Key: SPARK-50974
                 URL: https://issues.apache.org/jira/browse/SPARK-50974
             Project: Spark
          Issue Type: Sub-task
          Components: Connect, ML, PySpark
    Affects Versions: 4.0.0, 4.1
            Reporter: Bobby Wang

Error message:

    cvModel2 = cv_with_user_folds.fit(dataset_with_folds)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/xxx/work.d/spark/spark-master/python/pyspark/ml/base.py", line 203, in fit
    return self._fit(dataset)
           ^^^^^^^^^^^^^^^^^^
  File "/home/xxx/work.d/spark/spark-master/python/pyspark/ml/tuning.py", line 848, in _fit
    datasets = self._kFold(dataset)
               ^^^^^^^^^^^^^^^^^^^^
  File "/home/xxx/work.d/spark/spark-master/python/pyspark/ml/tuning.py", line 906, in _kFold
    training = dataset.filter(checker_udf(dataset[foldCol]) & (col(foldCol) != lit(i)))
                              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/xxx/work.d/spark/spark-master/python/pyspark/sql/udf.py", line 405, in __call__
    jcols = [_to_java_column(arg) for arg in args] + [
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/xxx/work.d/spark/spark-master/python/pyspark/sql/udf.py", line 405, in <listcomp>
    jcols = [_to_java_column(arg) for arg in args] + [
             ^^^^^^^^^^^^^^^^^^^^
  File "/home/xxx/work.d/spark/spark-master/python/pyspark/sql/classic/column.py", line 71, in _to_java_column
    raise PySparkTypeError(
pyspark.errors.exceptions.base.PySparkTypeError: [NOT_COLUMN_OR_STR] Argument `col` should be a Column or str, got Column.
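
For readers unfamiliar with what the failing line in `_kFold` is trying to do, the following is a minimal plain-Python sketch of a split driven by a user-supplied fold column: for each fold index `i`, rows whose fold value differs from `i` go to training and rows whose fold value equals `i` go to validation. It does not use Spark at all; `k_fold_by_column` and `Row` are hypothetical names introduced here for illustration, and the range check only approximates what `checker_udf` appears to do. The traceback also suggests (an inference, not something stated in this report) that the Column built from a Spark Connect DataFrame reaches the classic `_to_java_column` helper, which would explain the odd "should be a Column or str, got Column" wording.

```python
from typing import Dict, List, Tuple

# Hypothetical row type for this sketch: a plain dict standing in for a DataFrame row.
Row = Dict[str, float]


def k_fold_by_column(
    rows: List[Row], fold_col: str, n_folds: int
) -> List[Tuple[List[Row], List[Row]]]:
    """Split rows into (training, validation) pairs using a fold-index column.

    Illustrative only; this is not the PySpark implementation.
    """
    # Reject fold values outside [0, n_folds), as a fold-number checker would.
    for row in rows:
        fold = row[fold_col]
        if fold < 0 or fold >= n_folds:
            raise ValueError(
                f"Fold number must be in range [0, {n_folds}), but got {fold}."
            )

    splits = []
    for i in range(n_folds):
        # Training set: every row NOT assigned to fold i.
        training = [r for r in rows if r[fold_col] != i]
        # Validation set: every row assigned to fold i.
        validation = [r for r in rows if r[fold_col] == i]
        splits.append((training, validation))
    return splits


if __name__ == "__main__":
    sample = [
        {"x": 0.0, "fold": 0},
        {"x": 1.0, "fold": 1},
        {"x": 2.0, "fold": 2},
        {"x": 3.0, "fold": 0},
    ]
    for i, (train, valid) in enumerate(k_fold_by_column(sample, "fold", 3)):
        print(i, len(train), len(valid))
```

In the real code path the same two predicates are expressed as DataFrame filters, which is where the Column passed to the UDF triggers the error above.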