phakawatfong opened a new pull request, #50205: URL: https://github.com/apache/spark/pull/50205
When `ignoreColumnType=True` is passed to `assertDataFrameEqual`, all values are cast to `str`, which causes the comparison between `val1` and `val2` to fail. For example:

val1 = 1505.76189560
val2 = 1505.761896

`assertDataFrameEqual(source_df, databricks_df, ignoreColumnType=True)`

The function first converts the original data to strings with this clause:

```
if ignoreColumnType:
    actual = cast_columns_to_string(actual)
    expected = cast_columns_to_string(expected)
.
.
compare_vals(val1, val2)
```

The check then becomes a string comparison instead of a comparison using the tolerance equation: `'1505.76189560' != '1505.761896'`. Hence an assertion error is raised, even when `atol` and `rtol` are already set.

### What changes were proposed in this pull request?
Add an `elif` case that converts `str` values to `float` and passes them to the tolerance calculation.

### Why are the changes needed?
This might not be the only way to fix it; it is offered as one idea. Thank you!

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
Tested on my local data source by comparing two DataFrames with the code patched into my local environment.

### Was this patch authored or co-authored using generative AI tooling?
No
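To illustrate the idea described above, here is a minimal standalone sketch of a string-aware tolerance check. It is not the patch itself and does not use PySpark: the names `_to_float` and `compare_vals_tolerant` are hypothetical, and the tolerance formula `abs(a - b) <= atol + rtol * abs(b)` with defaults `rtol=1e-5` and `atol=1e-8` is an assumption about how the tolerance is applied.

```python
import math


def _to_float(value):
    """Return value parsed as a float, or None if it cannot be parsed."""
    try:
        return float(value)
    except (TypeError, ValueError):
        return None


def compare_vals_tolerant(val1, val2, rtol=1e-5, atol=1e-8):
    """Hypothetical helper: compare two values that may be numeric strings.

    If both values are strings that parse to finite floats, compare them
    with a tolerance; otherwise fall back to plain equality.
    """
    if isinstance(val1, str) and isinstance(val2, str):
        f1, f2 = _to_float(val1), _to_float(val2)
        if (
            f1 is not None
            and f2 is not None
            and math.isfinite(f1)
            and math.isfinite(f2)
        ):
            # Assumed tolerance equation, relative to the second value.
            return abs(f1 - f2) <= atol + rtol * abs(f2)
    # Non-numeric strings, NaN/inf, and other types: exact comparison.
    return val1 == val2


# The values from the example above differ as strings...
print("1505.76189560" == "1505.761896")  # False
# ...but are equal within tolerance once parsed as floats.
print(compare_vals_tolerant("1505.76189560", "1505.761896"))  # True
```

One consequence of this approach is that any pair of numeric-looking strings is compared with tolerance, including columns that were genuinely strings before the cast, so reviewers may want to decide whether that is acceptable.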