I agree, in broad strokes at least. Interested to hear others’ positions.
> On 2 Oct 2018, at 16:44, Ariel Weisberg <ar...@weisberg.ws> wrote:
>
> Hi,
>
> I think overflow and the role of widening conversions are pretty linked, so
> I'll continue to inject that into this discussion. Also, overflow is much
> worse, since most applications won't be impacted by a loss of precision when
> an expression involves an int and a float, but will care quite a bit if they
> get some nonsense wrapped number in an integer-only expression.
>
> For VoltDB, in practice we didn't run into issues with applications not making
> progress due to exceptions with real data, thanks to the widening conversions.
> The ranges of double and long are pretty big, and that hides wrap
> around/infinity.
>
> I think the proposal of having all operations return a decimal is attractive
> in that these expressions always result in a consistent type. Two pain points
> might be whether client languages have decimal support and whether there is a
> performance issue. The nice thing about always returning decimal is that we can
> sidestep the issue of overflow.
>
> I would start with seeing if that's acceptable, and if it isn't, then look at
> other approaches, like returning a variety of types, such as when doing int + int
> return a bigint, or int + float return a double.
>
> If we take an approach that allows overflow, the ideal end state IMO would be
> to get all users to run Cassandra in a way that overflow results in an error,
> even in the context of aggregation. The road to get there is tricky, but
> maybe start by having it as an opt-in tunable in cassandra.yaml. I don't know
> how/when we could ever change that as a default, and it's unfortunate having
> an option like this that 99% won't know they should flip.
>
> It seems like having the default throw on overflow is not as bad as it sounds
> if you do the widening conversions, since most people won't run into them. The
> change in the column types of result sets actually sounds worse if we want
> to also improve aggregations.
> Many applications won't notice if the client
> library abstracts that away, but I think there are still cases where people
> would notice the type changing.
>
> Ariel
>
>> On Tue, Oct 2, 2018, at 11:09 AM, Benedict Elliott Smith wrote:
>> This (overflow) is an excellent point, but this also affects
>> aggregations, which were introduced a long time ago. They already
>> inherit Java semantics for all of the relevant types (silent wrap
>> around). We probably want to be consistent, meaning either changing
>> aggregations (which incurs a cost for changing API) or continuing the
>> Java semantics here.
>>
>> This is why having these discussions explicitly in the community before
>> a release is so critical, in my view. It’s very easy for these semantic
>> changes to go unnoticed on a JIRA, and then ossify.
>>
>>
>>> On 2 Oct 2018, at 15:48, Ariel Weisberg <ar...@weisberg.ws> wrote:
>>>
>>> Hi,
>>>
>>> I think we should decide based on what is least surprising as you mention,
>>> but isn't overridden by some other concern.
>>>
>>> It seems to me the priorities are
>>>
>>> * Correctness
>>> * Performance
>>> * User visible complexity
>>> * Developer visible complexity
>>>
>>> Defaulting to silent implicit data loss is not ideal from a correctness
>>> standpoint.
>>>
>>> Doing something better, like using wider types, doesn't seem like a
>>> performance issue.
>>>
>>> From a user standpoint, doing something less lossy doesn't look more complex
>>> as long as it's consistent, documented, and doesn't change from version
>>> to version.
>>>
>>> There is some developer complexity, but this is a public API and we only
>>> get one shot at this.
>>>
>>> I wonder about how overflow is handled as well. In VoltDB I think we threw
>>> on overflow and tended to just do widening conversions to make that less
>>> common. We didn't imitate another database (as far as I know); we just went
>>> with what was least likely to silently corrupt data.
>>> https://github.com/VoltDB/voltdb/blob/master/src/ee/common/NValue.hpp#L2213
>>> https://github.com/VoltDB/voltdb/blob/master/src/ee/common/NValue.hpp#L3764
>>>
>>> Ariel
>>>
>>>> On Tue, Oct 2, 2018, at 7:30 AM, Benedict Elliott Smith wrote:
>>>> [...] introduced arithmetic operators, and alongside these
>>>> came implicit casts for their operands. There is a semantic decision to
>>>> be made, and I think the project would do well to explicitly raise this
>>>> kind of question for wider input before release, since the project is
>>>> bound by them forever more.
>>>>
>>>> In this case, the choice is between lossy and lossless casts for
>>>> operations involving integers and floating point numbers. In essence,
>>>> should:
>>>>
>>>> (1) float + int = float, double + bigint = double; or
>>>> (2) float + int = double, double + bigint = decimal; or
>>>> (3) float + int = decimal, double + bigint = decimal
>>>>
>>>> Option 1 performs a lossy implicit cast from int -> float, or bigint ->
>>>> double. Simply casting between these types changes the value. This is
>>>> what MS SQL Server does.
>>>> Options 2 and 3 cast without loss of precision, and 3 (or thereabouts)
>>>> is what PostgreSQL does.
>>>>
>>>> The question I’m interested in is not just which is the right decision,
>>>> but how the right decision should be arrived at. My view is that we
>>>> should primarily aim for least surprise to the user, but I’m keen to
>>>> hear from others.
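For readers following the thread, here is a minimal Java sketch of the behaviours being discussed. It is not Cassandra or VoltDB code. It assumes the usual mapping of CQL int, bigint, float, double and decimal onto Java int, long, float, double and BigDecimal, and the class and method names (ImplicitCastSketch, addAsFloat, and so on) are hypothetical, chosen only for illustration.

```java
import java.math.BigDecimal;

// Illustration only: hypothetical helpers, not part of any Cassandra API.
class ImplicitCastSketch {

    // Option 1: float + int = float. Java promotes the int to float,
    // which is lossy once the magnitude exceeds 2^24.
    static float addAsFloat(float f, int i) {
        return f + i;
    }

    // Option 2: float + int = double. Every int is exactly representable
    // as a double, so the int operand keeps its value.
    static double addAsDouble(float f, int i) {
        return (double) f + i;
    }

    // Option 3: float + int = decimal. Both operands are converted exactly.
    static BigDecimal addAsDecimal(float f, int i) {
        return new BigDecimal(f).add(BigDecimal.valueOf(i));
    }

    // Java semantics for int + int: silent wrap around on overflow.
    static int addWrapping(int a, int b) {
        return a + b;
    }

    // Throw-on-overflow alternative: ArithmeticException instead of wrapping.
    static int addOrThrow(int a, int b) {
        return Math.addExact(a, b);
    }

    // Widening alternative: int + int = bigint, which cannot overflow
    // for any pair of int operands.
    static long addWidened(int a, int b) {
        return (long) a + b;
    }

    public static void main(String[] args) {
        int i = 16_777_217; // 2^24 + 1, not representable as a float

        System.out.println(addAsFloat(0f, i));   // value is 16777216: the int changed
        System.out.println(addAsDouble(0f, i));  // value is 16777217: exact
        System.out.println(addAsDecimal(0f, i)); // value is 16777217: exact

        long l = 9_007_199_254_740_993L; // 2^53 + 1, not representable as a double
        System.out.println((long) (double) l == l); // false: bigint -> double is lossy

        System.out.println(addWrapping(Integer.MAX_VALUE, 1) == Integer.MIN_VALUE); // true
        System.out.println(addWidened(Integer.MAX_VALUE, 1)); // 2147483648
        try {
            addOrThrow(Integer.MAX_VALUE, 1);
        } catch (ArithmeticException e) {
            System.out.println("overflow detected: " + e.getMessage());
        }
    }
}
```

The first three helpers correspond to options (1), (2) and (3) for float + int; the last three contrast silent wrap around, throwing on overflow, and widening the result type, which are the overflow strategies raised in the replies.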
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
For additional commands, e-mail: dev-h...@cassandra.apache.org