On Fri, 13 Apr 2012 18:09:24 +0200, Stas Malyshev <smalys...@sugarcrm.com>
wrote:
There are other situations where the result of the comparison may be
"inaccurate" -- in the sense that two strings may be constructed as
representing different numbers, but they compare equal.
* Comparing two different real numbers that map to the same double
precision number:
var_dump("1.9999999999999999" == "2"); //true
For floats, there's no accurate comparison anyway, it is a known fact.
However, you are not comparing floats, you're comparing strings. As I
showed, floats in strings are already treated differently depending on
whether they're in string form or not (1e400 == 1e400 vs "1e400" ==
"1e400"). What's under discussion is once again whether to treat
a proper integer differently from an integer in string form.
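
To make the float versus float-in-string difference concrete, here is a
small sketch. The second operand "1e401" is my own choice for illustration
(the text above only mentions "1e400"), and the behaviour described in the
comments is what I would expect from a current build, not something quoted
from this thread.

```php
<?php
// Float operands: both literals overflow to INF, so they compare equal.
$floatsEqual = (1e400 == 1e401);

// String operands: both are numeric strings whose value overflows to INF.
// As far as I can tell, the engine then falls back to comparing the bytes,
// so two different overflowing strings are not considered equal.
$stringsEqual = ("1e400" == "1e401");

// Two different decimal strings can still land on the same double.
$sameDouble = ((float) "1.9999999999999999" === 2.0);

var_dump($floatsEqual, $stringsEqual, $sameDouble);
```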
[...]
However, taking the last case as an example, this is the same thing that
happens if you compare:
var_dump((int)"9223372036854775807" == (double)"9223372036854775808"); //true
This, however, is a different case since you explicitly coerce the types
and you must know that both conversions are lossy. It's like doing
substr($a, 0, 1) == substr($b, 0, 1) - of course it can return true even
if $a and $b are different. When you convert a bigger type (string) to a
smaller type (int) you must accept the potential loss or check for it if
it's important.
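
One way to "check for it" could look like the sketch below. The helper
name intCastRoundTrips() is hypothetical, and a round-trip test is only
one possible definition of a lossless cast; it is an assumption of this
example, not something PHP provides.

```php
<?php
// Hypothetical helper: true when casting $s to int and back to string
// reproduces $s byte for byte.
function intCastRoundTrips(string $s): bool
{
    return (string) (int) $s === $s;
}

// One past the largest 64-bit signed integer; it cannot survive the cast.
$tooBig = "9223372036854775808";

var_dump(intCastRoundTrips("42"));
var_dump(intCastRoundTrips((string) PHP_INT_MAX));
var_dump(intCastRoundTrips($tooBig));
```

Note that such a strict check also rejects strings like " 02", which cast
to 2 without any overflow at all.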
However I think it would make sense not to use this conversion in string
comparisons when we know it's lossy - it seems to be outside of the use
case for such comparisons and it seems apparent by now that it is hard
for people to understand why it works this way.
First, I don't think this discussion gets any clearer by using ambiguous
terms such as "lossy" and saying "lossy is bad". Is (int) " 02" a lossy
conversion -- you lose the space and 0? What about even (float) "1" -- 1.0
is mapped from an infinite number of real numbers due to rounding, and you
have no way to know which one was the original? And in any case, I don't
think you mean that (int)"9223372036854775807" is a lossy conversion as it
results in 9223372036854775807 (depending on the width of long, of course).
(by the way, these are rhetorical questions, I don't care about
establishing a definition of "lossy" in this thread)
In any case, your selective quoting destroyed the main point of my e-mail
-- that is, this problem implicates these questions: is
"9223372036854775808" different from 9223372036854775808? Is
"9223372036854775808" still deemed to represent an integer, even though we
cannot represent it as an integer type?
I think most people can agree that this behavior is correct:
var_dump(9223372036854775807 == 9223372036854775808); //true
Therefore, we need some principled distinction to treat the case
"9223372036854775807" == "9223372036854775808" differently. The
distinction I propose is answering "yes" to the questions above: they
represent different entities, and when the integer string cannot be
converted to the integer type we should fall back to memcmp(). This is
what is already done with the overflowing "1e400". I don't find it
particularly convincing, though.
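
For concreteness, the fallback rule could be modelled in user land roughly
as follows. This is only a sketch of one reading of the proposal: the
function names are made up, strcmp() stands in for memcmp(), and none of
this is engine code.

```php
<?php
// Hypothetical: does $s consist of an optional sign followed by digits only?
function isIntegerString(string $s): bool
{
    return preg_match('/^[+-]?[0-9]+$/D', $s) === 1;
}

// An integer string "overflows" when PHP can only hold its value as a
// float, i.e. adding 0 to it yields a float rather than an int.
function overflowsInt(string $s): bool
{
    return isIntegerString($s) && is_float($s + 0);
}

// Model of the proposed rule for comparing two strings.
function proposedEquals(string $a, string $b): bool
{
    if (overflowsInt($a) || overflowsInt($b)) {
        // No exact integer conversion exists: compare the bytes instead.
        return strcmp($a, $b) === 0;
    }
    // Otherwise keep the usual loose comparison.
    return $a == $b;
}

var_dump(proposedEquals("9223372036854775807", "9223372036854775808"));
var_dump(proposedEquals("02", "2"));
```

Under this model the two large strings compare as different, while "02"
and "2" still compare equal through the ordinary numeric path.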
--
Gustavo Lopes
--
PHP Internals - PHP Runtime Development Mailing List