Hi all,

I am writing to discuss the issue reported in SPARK-25454. Following Wenchen's suggestion, I have prepared a design doc for it.
The problem we are facing is that our rules for decimal operations are taken from Hive and MS SQL Server, which explicitly do not support decimals with negative scales, so our current rules are not meant to deal with them. Spark, however, does not forbid negative scales, and there are indeed cases in which we produce them (e.g. a SQL constant like 1e8 is turned into a decimal(1, -8)).

Having negative scales was most likely never intended, but getting rid of them would be a breaking change: many operations that currently work fine would no longer be allowed and would overflow (e.g. select 1e36 * 10000). This is something I would definitely agree on doing, but I think we can target it only for 3.0. What we can start doing now, instead, is update our rules so that they also handle negative scales properly. From my investigation, it turns out that the only operation which has problems with them is Divide.

Here you can find the design doc with all the details: https://docs.google.com/document/d/17ScbMXJ83bO9lx8hB_jeJCSryhT9O_HDEcixDq0qmPk/edit?usp=sharing. The document is also linked in SPARK-25454. There is also already a PR with the change: https://github.com/apache/spark/pull/22450.

Looking forward to hearing your feedback.

Thanks,
Marco
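
P.S. In case anyone wants to reproduce the behaviour locally, here is a minimal spark-shell sketch (it assumes the usual spark session created by spark-shell on a current 2.x build; the exact way the type is rendered may differ across versions):

    // The exponent literal is parsed as a decimal with a negative scale,
    // illustrating the first example above.
    val dt = spark.sql("SELECT 1e8").schema.head.dataType
    println(dt) // expected, per the above: DecimalType(1,-8)

    // This multiplication works today only because negative scales are allowed;
    // without them the result would need a precision beyond the 38-digit limit
    // and would overflow.
    spark.sql("SELECT 1e36 * 10000").show()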