On Thu, Mar 4, 2021 at 11:54 AM Nikita Popov <nikita....@gmail.com> wrote:

> On Thu, Mar 4, 2021 at 10:54 AM Christian Schneider <cschn...@cschneid.com>
> wrote:
>
>> Am 04.03.2021 um 01:37 schrieb Ben Ramsey <b...@benramsey.com>:
>> > On Mar 3, 2021, at 14:25, Kamil Tekiela <tekiela...@gmail.com> wrote:
>> >>
>> >> when both are strings then chances are that this is an error.
>> >
>> > Except when comparing two values from sources known to provide numbers
>> > as strings, such as form input and database results. :-)
>>
>>
>> This would be a problem for leading zeroes and leading/trailing spaces,
>> right?
>>
>> Leading zeroes theoretically could happen in databases, leading/trailing
>> spaces happen in form input and possibly databases.
>> Are there other 'common' cases?
>>
>
> The main one that comes to mind is something like '0' == '0.0'. However,
> the real problem is something else: Comparison behavior doesn't affect just
> == and !=, but also < and >. And I can see how people would want '2' < '10'
> to be true (numeric comparison) rather than false (lexicographical
> comparison).
>
> I generally agree that we should remove the special "numeric string"
> handling for equality comparisons, and I don't think that removing that
> behavior would have a major impact. But we do need to carefully consider
> the impact it has on relational operators. There are two ways I can see
> this going:
>
> 1. Decouple equality comparison from relational comparison. Don't handle
> numeric strings for == and !=, but do handle them for <, >, etc. The
> disadvantage is that comparison results may not be trichotomous, e.g. for
> "0" op "0.0" all of ==, < and > would return false. (To be fair, this can
> already happen in other cases, e.g. non-comparable objects.)
>
> 2. Don't allow relational comparison on strings. If you want to compare
> them lexicographically, use strcmp(), otherwise cast to number first.
> ("Don't allow" here could be a warning to start with.)
>

Regarding the last point, while I think that lexicographical comparison
with explicit < and > operators is pretty uncommon, sorting an array of
strings and expecting lexicographical order probably isn't unusual. While
SORT_STRING can be passed to enforce that, people probably expect that as
the default behavior. So just not allowing relational comparison is not a
great option either.
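
To make the sorting point concrete, here is a rough sketch of how things
behave today, as far as I can tell. The sample array is made up purely for
illustration; only sort(), SORT_STRING and strcmp() are real. It shows that
the default sort() flags treat numeric strings as numbers, that SORT_STRING
gives lexicographical order, and that the same split exists between < and
strcmp().

```php
<?php
// Hypothetical sample data: numeric-looking strings, e.g. from a form or a database.
$values = ['10', '9', '2', '1'];

// Default flags (SORT_REGULAR): two numeric strings are compared as numbers.
$regular = $values;
sort($regular);

// SORT_STRING: plain lexicographical (byte-wise) comparison.
$lexical = $values;
sort($lexical, SORT_STRING);

var_dump($regular === ['1', '2', '9', '10']); // bool(true)
var_dump($lexical === ['1', '10', '2', '9']); // bool(true)

// The relational operator and strcmp() disagree in the same way.
var_dump('2' < '10');            // bool(true): numeric comparison
var_dump(strcmp('2', '10') < 0); // bool(false): '2' sorts after '1' lexicographically
```

So any change to how < and > treat strings would also have to decide what
the default (flag-less) sort() does with arrays like this one.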

Nikita
