phakawatfong opened a new pull request, #50205:
URL: https://github.com/apache/spark/pull/50205

   When I pass ignoreColumnType=True to assertDataFrameEqual, it casts all 
values to 'str', which causes the tolerance-based comparison between val1 and 
val2 to fail.
   For example:
   
   val1 = 1505.76189560
   val2 = 1505.761896
   assertDataFrameEqual(source_df, databricks_df, ignoreColumnType=True)
   
   The function first converts the original data to strings with this clause:
   ```
   if ignoreColumnType:
       actual = cast_columns_to_string(actual)
       expected = cast_columns_to_string(expected)
   # ... (intermediate code omitted)
   compare_vals(val1, val2)
   ```
   The check then becomes a string comparison instead of a comparison that 
uses the tolerance equation:
   '1505.76189560' != '1505.761896'
   
   Hence an AssertionError is raised, even if atol and rtol are already set.
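
The mismatch can be reproduced without Spark. The following is a minimal sketch, assuming the tolerance check has the form `abs(val1 - val2) <= atol + rtol * abs(val2)`; the helper `within_tolerance` and the values `rtol=1e-5`, `atol=1e-8` are illustrative assumptions, not taken from the PySpark source.

```python
def within_tolerance(val1: float, val2: float,
                     rtol: float = 1e-5, atol: float = 1e-8) -> bool:
    # Hypothetical helper: assumed shape of a tolerance-based float check.
    return abs(val1 - val2) <= atol + rtol * abs(val2)


val1 = 1505.76189560
val2 = 1505.761896

# As floats, the two values differ by about 4e-7, well inside the tolerance.
as_floats_equal = within_tolerance(val1, val2)

# As strings (the form they take after the cast), they are simply different.
as_strings_equal = "1505.76189560" == "1505.761896"

print(as_floats_equal, as_strings_equal)
```

Once both values are strings, the tolerance arguments never come into play, which matches the behaviour described above.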
   
   
   
   ### What changes were proposed in this pull request?
   Add another elif case that converts str values to float and passes them 
into the tolerance calculation.
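
One way such a branch might look is sketched below. This is not the code in the patch: `compare_vals_sketch` is a hypothetical stand-in for the real comparison helper, and the tolerance formula and default values are assumptions made for illustration.

```python
def compare_vals_sketch(val1, val2, rtol: float = 1e-5, atol: float = 1e-8) -> bool:
    # Hypothetical stand-in for the comparison helper; not the PySpark source.
    if val1 == val2:
        return True
    if isinstance(val1, float) and isinstance(val2, float):
        return abs(val1 - val2) <= atol + rtol * abs(val2)
    elif isinstance(val1, str) and isinstance(val2, str):
        # Proposed idea: if both strings parse as numbers, compare them
        # with the tolerance instead of character by character.
        try:
            f1, f2 = float(val1), float(val2)
        except ValueError:
            return False  # non-numeric strings that already differ
        return abs(f1 - f2) <= atol + rtol * abs(f2)
    return False


print(compare_vals_sketch("1505.76189560", "1505.761896"))
print(compare_vals_sketch("abc", "abd"))
```

A trade-off of this approach is that genuinely textual columns whose values happen to parse as numbers, such as "1e3" and "1000", would also be treated as equal.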
   
   ### Why are the changes needed?
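   With ignoreColumnType=True, numeric values are compared as strings, so 
atol and rtol have no effect. This might not be the only way to fix it, but I 
am offering it as one idea. Thank you!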
   
   
   ### Does this PR introduce _any_ user-facing change?
   No
   
   ### How was this patch tested?
   Tested on my local data source by patching the code into my local 
environment and comparing two DataFrames.
   
   
   
   ### Was this patch authored or co-authored using generative AI tooling?
   No
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


