On Thu, Mar 4, 2021 at 11:54 AM Nikita Popov <nikita....@gmail.com> wrote:
> On Thu, Mar 4, 2021 at 10:54 AM Christian Schneider <cschn...@cschneid.com>
> wrote:
>
>> On 04.03.2021 at 01:37, Ben Ramsey <b...@benramsey.com> wrote:
>> > On Mar 3, 2021, at 14:25, Kamil Tekiela <tekiela...@gmail.com> wrote:
>> >>
>> >> when both are strings then chances are that this is an error.
>> >
>> > Except when comparing two values from sources known to provide numbers
>> > as strings, such as form input and database results. :-)
>>
>>
>> This would be a problem for leading zeroes and leading/trailing spaces,
>> right?
>>
>> Leading zeroes theoretically could happen in databases, leading/trailing
>> spaces happen in form input and possibly databases.
>> Are there other 'common' cases?
>>
>
> The main one that comes to mind is something like '0' == '0.0'. However,
> the real problem is something else: Comparison behavior doesn't affect just
> == and !=, but also < and >. And I can see how people would want '2' < '10'
> to be true (numeric comparison) rather than false (lexicographical
> comparison).
>
> I generally agree that we should remove the special "numeric string"
> handling for equality comparisons, and I don't think that removing that
> behavior would have a major impact. But we do need to carefully consider
> the impact it has on relational operators. There are two ways I can see
> this going:
>
> 1. Decouple equality comparison from relational comparison. Don't handle
> numeric strings for == and !=, but do handle them for <, >, etc. The
> disadvantage is that comparison results may not be trichotomous, e.g. for
> "0" op "0.0" all of ==, < and > would return false. (To be fair, this can
> already happen in other cases, e.g. non-comparable objects.)
>
> 2. Don't allow relational comparison on strings. If you want to compare
> them lexicographically, use strcmp(), otherwise cast to number first.
> ("Don't allow" here could be a warning to start with.)
>

Regarding the last point, while I think that lexicographical comparison with explicit < and > operators is pretty uncommon, sorting an array of strings and expecting lexicographical order probably isn't unusual. While SORT_STRING can be passed to enforce that, people probably expect that as the default behavior. So just not allowing relational comparison is not a great option either.

Nikita
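
For reference, the current behavior discussed in this thread can be sketched roughly as follows. This is only an illustration, assuming PHP 8 semantics; the variable names are made up for the example and are not part of any proposal.

```php
<?php
// Equality: both operands are numeric strings, so == compares them as numbers.
$looseEqual  = ('0' == '0.0');   // true under the current numeric-string handling
$strictEqual = ('0' === '0.0');  // false: === compares the strings as-is

// Relational: numeric strings are compared as numbers, strcmp() compares bytes.
$numericLess = ('2' < '10');              // true (2 < 10 numerically)
$lexicalSign = strcmp('2', '10') <=> 0;   // 1: "2" sorts after "10" lexicographically
                                          // (only the sign of strcmp() is relied on)

// Sorting: the default flag is SORT_REGULAR, which compares like < does.
$default = ['10', '9', '2'];
sort($default);                  // numeric order for numeric strings

$asStrings = ['10', '9', '2'];
sort($asStrings, SORT_STRING);   // lexicographical order, requested explicitly

var_dump($looseEqual, $strictEqual, $numericLess, $lexicalSign);
echo implode(',', $default), "\n";    // 2,9,10
echo implode(',', $asStrings), "\n";  // 10,2,9
```

The last two lines show the concern about sorting: without SORT_STRING, an array of numeric strings does not come out in lexicographical order.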