andygrove commented on code in PR #351:
URL: https://github.com/apache/datafusion-comet/pull/351#discussion_r1585180079
##########
spark/src/test/scala/org/apache/comet/CometCastSuite.scala:
##########
@@ -176,28 +697,38 @@ class CometCastSuite extends CometTestBase with AdaptiveSparkPlanHelper {
     // cast() should throw exception on invalid inputs when ansi mode is enabled
val df = data.withColumn("converted", col("a").cast(toType))
- val (expected, actual) = checkSparkThrows(df)
-
- if (CometSparkSessionExtensions.isSpark34Plus) {
-      // We have to workaround https://github.com/apache/datafusion-comet/issues/293 here by
-      // removing the "Execution error: " error message prefix that is added by DataFusion
- val cometMessage = actual.getMessage
- .substring("Execution error: ".length)
-
- assert(expected.getMessage == cometMessage)
- } else {
-      // Spark 3.2 and 3.3 have a different error message format so we can't do a direct
-      // comparison between Spark and Comet.
-      // Spark message is in format `invalid input syntax for type TYPE: VALUE`
-      // Comet message is in format `The value 'VALUE' of the type FROM_TYPE cannot be cast to TO_TYPE`
-      // We just check that the comet message contains the same invalid value as the Spark message
- val sparkInvalidValue =
- expected.getMessage.substring(expected.getMessage.indexOf(':') + 2)
- assert(actual.getMessage.contains(sparkInvalidValue))
+ checkSparkMaybeThrows(df) match {
+ case (None, None) =>
+ // neither system threw an exception
+ case (None, Some(e)) =>
+ // Spark succeeded but Comet failed
+ throw e
+ case (Some(e), None) =>
+ // Spark failed but Comet succeeded
+ fail(s"Comet should have failed with ${e.getCause.getMessage}")
+ case (Some(sparkException), Some(cometException)) =>
+        // both systems threw an exception so we make sure they are the same
+ val sparkMessage = sparkException.getCause.getMessage
+        // We have to workaround https://github.com/apache/datafusion-comet/issues/293 here by
+        // removing the "Execution error: " error message prefix that is added by DataFusion
+ val cometMessage = cometException.getCause.getMessage
+ .replace("Execution error: ", "")
+ if (CometSparkSessionExtensions.isSpark34Plus) {
+ assert(cometMessage == sparkMessage)
+ } else {
+          // Spark 3.2 and 3.3 have a different error message format so we can't do a direct
+          // comparison between Spark and Comet.
+          // Spark message is in format `invalid input syntax for type TYPE: VALUE`
+          // Comet message is in format `The value 'VALUE' of the type FROM_TYPE cannot be cast to TO_TYPE`
+          // We just check that the comet message contains the same invalid value as the Spark message
+          val sparkInvalidValue = sparkMessage.substring(sparkMessage.indexOf(':') + 2)
+ assert(cometMessage.contains(sparkInvalidValue))
+ }
}
// try_cast() should always return null for invalid inputs
- val df2 = spark.sql(s"select try_cast(a as ${toType.sql}) from t")
+    val df2 =
+      spark.sql(s"select a, try_cast(a as ${toType.sql}) from t order by a")
Review Comment:
If the input dataset has a mix of valid, invalid, and null values, we should expect try_cast to produce null for the rows that are null or invalid. The test is checking that we produce the same results as Spark, so I think we have everything covered, but maybe I am missing something?
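To make the expected try_cast semantics concrete, here is a hedged sketch in plain Scala (not code from this PR, and no Spark dependency): `Option` stands in for SQL NULL, and `tryCastToInt` is a hypothetical helper that mimics try_cast's behavior of returning null instead of throwing on invalid input.

```scala
// Hypothetical stand-in for SQL `try_cast(a as int)`:
// None models SQL NULL; invalid strings yield None instead of throwing.
def tryCastToInt(value: Option[String]): Option[Int] =
  value.flatMap(s => scala.util.Try(s.trim.toInt).toOption)

// A mixed column of valid, invalid, and null inputs.
val inputs = Seq(Some("123"), Some("not a number"), None)

// Valid rows cast successfully; invalid and null rows both become None,
// which is the behavior the test compares against Spark's own results.
val results = inputs.map(tryCastToInt)
```

Under this model, only the valid row survives the cast; the invalid and null rows both map to None, matching the comment's expectation.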
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]