That is feasible. The main point is that negative scales were never really meant to be there in the first place, so they are something we simply forgot to forbid, and something the DBs we are drawing our inspiration from for decimals (mainly SQL Server) do not support. Honestly, my opinion on this topic is:

- let's add support for negative scales in operations (I already have a PR out for that, https://github.com/apache/spark/pull/22450);
- let's reduce our usage of DECIMAL in favor of DOUBLE when parsing literals, as done by Hive, Presto, DB2, ...;

so the number of cases where we deal with negative scales is anyway small (and we do not have issues with data sources which don't support them).
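A minimal sketch of the trade-off in the second point, using plain java.math.BigDecimal rather than Spark's parser or Decimal type (so this only illustrates the two representations, not Spark's actual literal-parsing behavior): kept as a decimal, a literal such as 1e36 is exact but carries a negative scale; kept as a double, there is no precision/scale bookkeeping at all for a data source to reject.

    import java.math.BigDecimal

    // Exact decimal representation of 1e36: unscaled value 1 with scale -36.
    val asDecimal = new BigDecimal("1e36")
    println(s"precision=${asDecimal.precision}, scale=${asDecimal.scale}")  // precision=1, scale=-36

    // The same literal as a double: approximate, but with no scale to forbid
    // or to trip up a data source.
    val asDouble = "1e36".toDouble
    println(asDouble)  // 1.0E36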
Thanks,
Marco

On Tue, Dec 18, 2018 at 7:08 PM Reynold Xin <r...@databricks.com> wrote:

> So why can't we just do validation to fail sources that don't support
> negative scale, if it is not supported? This way, we don't need to break
> backward compatibility in any way and it becomes a strict improvement.
>
> On Tue, Dec 18, 2018 at 8:43 AM, Marco Gaido <marcogaid...@gmail.com> wrote:
>
>> This is at analysis time.
>>
>> On Tue, 18 Dec 2018, 17:32 Reynold Xin <r...@databricks.com> wrote:
>>
>>> Is this an analysis time thing or a runtime thing?
>>>
>>> On Tue, Dec 18, 2018 at 7:45 AM Marco Gaido <marcogaid...@gmail.com> wrote:
>>>
>>>> Hi all,
>>>>
>>>> as you may remember, there was a design doc to support operations
>>>> involving decimals with negative scales. After the discussion in the
>>>> design doc, the related PR is now blocked because for 3.0 we have another
>>>> option we can explore, i.e. forbidding negative scales. This is probably
>>>> a cleaner solution, as most likely we never wanted negative scales, but
>>>> it is a breaking change: so we wanted to check the opinion of the
>>>> community.
>>>>
>>>> Getting to the topic, here are the 2 options:
>>>>
>>>> *- Forbidding negative scales*
>>>> Pros: many sources do not support negative scales (so they can create
>>>> issues); they were not considered possible in the initial implementation,
>>>> so we get to a more stable situation.
>>>> Cons: some operations which were supported earlier won't work anymore.
>>>> E.g., since our max precision is 38, if the scale cannot be negative,
>>>> 1e36 * 1e36 would cause an overflow, while it now works fine (producing
>>>> a decimal with negative scale); it is also basically impossible to create
>>>> a config which controls the behavior.
>>>>
>>>> *- Handling negative scales in operations*
>>>> Pros: no regressions; we support all the operations we supported on 2.x.
>>>> Cons: negative scales can cause issues elsewhere, e.g. when saving to a
>>>> data source which doesn't support them.
>>>>
>>>> Looking forward to hearing your thoughts,
>>>> Thanks.
>>>> Marco
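Returning to the 1e36 * 1e36 example quoted above, here is a minimal sketch with plain java.math.BigDecimal (again only an illustration; Spark's own precision/scale promotion rules for multiplication are not reproduced here) of why the product fits within the maximum precision of 38 only if the scale is allowed to be negative.

    import java.math.BigDecimal

    val a = new BigDecimal("1e36")  // unscaled value 1, precision 1, scale -36
    val product = a.multiply(a)     // 1e72: unscaled value 1, precision 1, scale -72
    println(s"precision=${product.precision}, scale=${product.scale}")  // precision=1, scale=-72

    // Forcing a non-negative scale makes the same value need 73 digits, well past
    // the maximum decimal precision of 38, hence the overflow described above.
    val rescaled = product.setScale(0)
    println(rescaled.precision)  // 73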