Hi Anatol,

Just a quick reply to couple parts... Have to go, and I don't care much either way about the stuff, just commenting before. :-)

----- Original Message -----
From: "Anatol Belski"
Sent: Tuesday, August 25, 2015

Hi Matt,

Thanks for the comments.

-----Original Message-----
From: Matt Wilmas [mailto:php_li...@realplain.com]
Sent: Tuesday, August 25, 2015 12:06 PM
To: Anatol Belski <anatol....@belski.net>; 'Dmitry Stogov'
<dmi...@php.net>;
'Xinchen Hui' <xinche...@zend.com>; 'Nikita Popov' <nikita....@gmail.com>;
'Pierre Joye' <pierre....@gmail.com>; 'Bob Weinand'
<bobw...@hotmail.com>; 'Jakub Zelenka' <bu...@php.net>
Cc: internals@lists.php.net
Subject: Re: [PHP-DEV] Overflow checks and integral vars comparison

Hi Anatol, Dmitry, all,

----- Original Message -----
From: "Anatol Belski"
Sent: Friday, August 21, 2015

> Hi,
>
> Resending this as missed internals at the start.
>
> I was lately rethinking some part of the 64-bit RFC, and also seeing
> now Jakub's work on catching overflows in ext/openssl and Matt
> Williams suggestions on it (which was going a bit more global over
> it). So I came up with these macros with two goals

Dang, people be gettin' my name wrong, in the same way, even "in writing."
;-P

Ohh, no intention on that, I will save it in my head, Matt Wilmas :)

Haha.

> - standardize the overflow checks
> - do actualy checks only on architectures where it's needed

The compiler will already do that, as I've said of course, and more
importantly
(for most cases), it will never get it wrong.

> - simplify the checks where external libs (openssl, libxml, etc.) use
> firm datatypes like int
>
> #if SIZEOF_INT == SIZEOF_ZEND_LONG
> # define ZEND_LONG_INT_OVFL(zl) (0)
> # define ZEND_LONG_INT_UDFL(zl) (0)
> #else
> # define ZEND_LONG_INT_OVFL(zlong) ((zlong) > (zend_long)INT_MAX) #
> define
> ZEND_LONG_INT_UDFL(zlong) ((zlong) < (zend_long)INT_MIN) #endif

Really no point in all that...

> #define ZEND_SIZE_T_INT_OVFL(size) ((size) > (size_t)INT_MAX)

What's wrong with just doing the check, "as is," instead of hiding it
behind a
[longer] macro?  (Also, none of the casts are necessary...?)

> So having it like
>
> If (ZEND_LONG_INT_OVFL(x)) {
> return;
> }
>
> Compiler would eliminate the branch automatically on 32-bit and ILP64.

Already will. :^)

> Some other macros to do signed/unsigned comparison, these can be
extended.
>
> #define ZEND_SIZE_T_GT_ZEND_LONG(size, zlong) ((zlong) < 0 || (size) >
> (size_t)(zlong)) #define ZEND_SIZE_T_GTE_ZEND_LONG(size, zlong)
> ((zlong) <
> 0
> || (size) >= (size_t)(zlong)) #define ZEND_SIZE_T_LT_ZEND_LONG(size,
> zlong)
> ((zlong) >= 0 && (size) < (size_t)(zlong)) #define
> ZEND_SIZE_T_LTE_ZEND_LONG(size, zlong) ((zlong) >= 0 && (size) <=
> (size_t)(zlong))

Yes, the only case where we might want to do something special is
comparing
an unsigned type, like size_t (string lengths), with a signed max, since
that's not
impossible to the compiler.  Of course, you have to KNOW that's what you
want
and/or it's safe to ignore the check, e.g. not dealing with strings > 2 GB
on 32-
bit, etc.

And again, a generic way to do that for a "max" of integer type or greater
(since
literals are smallest type >= int that will hold the value), that should
work
anywhere, is:

#define ZEND_SIGNED_OVFL_etc(val, max) (sizeof(val) > sizeof(max) && (val)

(max))

Simple and optimizes easily...

In general it's better to not to be optimistic about what compilers will do.

I didn't reply in last week's thread about the overflow checks in OpenSSL...

But it is *definitely* fine to be optimistic and rely on compiler to do basic, basic stuff like this. No reason to make things more complicated just to think one is helping the compiler.

Proof? Have you seen the whole VM in the last, like, 10 years? Using your logic, we have major, major, MAJOR problems literally everywhere there!

Now can anyone point to problems compiling the default, specialized VM? I rest my case. :-) There's conditional stuff ALL over the place (not macros), and propagated to inline functions, etc. where the compiler is "counted on" to remove the constant conditions and all that stuff.

The current FAST_ZPP parameter parsing, any unnecessary parts are optimized out (it'd be bad if that wasn't the case). My new proposal (coming soon...) has a lot of more complicated-looking source code, but almost all deleted by the compiler.

Transforming zpp's type spec string at compile time, which MSVC can't do because it does indeed suck for some reason, involves *dozens* of KB of source, many inline functions, etc. but nearly 100% of it is deleted, correctly, by GCC and Clang.


Anyway, that's why I say if "some" compiler can't do basic optimizations, it *sucks*, and has much bigger problems in the rest of the source, so there's no reason to worry about it... At all.

Also it is probably a question of taste. SIZEOF_INT == SIZEOF_ZEND_LONG is
something with zero calculation and guaranteed to exclude any possible side
effect, while sizeof(int) == sizeof(zend_long) depends on the optimization
choosen. Not disputing that your suggestion will do nearly the same job in
the usual case, ofc.

With the casts - probably yes, many cases will go by the standard type
promotion rules, but having the casts only ensures it does what one actually
intended to do and has no effect in the generated code. For example one
would need a cast when comparing with INT_MAX + 1. Worse is with unsigned,
because the result depends on the platform and concrete types and values.

Hiding behind the macros  makes full sense for two reasons at least

- there are people with various skills that can just rely on what the API
provides without going too deep into the detail
- doing an internal change won't affect anything in the existing usages

> IMHO these and maybe more are missing after the 64-bit RFC. Do you
> think they would make sense? Or would make sense now, or later in
master?

Also, why, exactly, has the zend_string length been defined as size_t
(just
because it's "supposed to" be?), which is incompatible with most
everything in
PHP, instead of zend_long (since we had int before)?

e.g. Just look at strlen() -- oops, size_t no worky! :-/ Dmitry? We need
range
checking there, right?  Conversion to double?  No?  Then there's *no
point* in using size_t, and would let the compiler auto-optimize the
above-
referenced comparisons with size_t...

Not sure what you mean here, 32-bit?

I mean, PHP's strlen() on any platform, why is it possible to have larger strings (size_t length) than can be returned in zend_long. That's a bug.

memory_limit is unsigned 32-bit there,
but even if not - it is hardly possible to have a string exceeding it there
anyway.

Not possible? Then like I said, why use size_t? It would help some of these things to get rid of it.

Otherwise, there is a plenty of libraries working with size_t and it
makes full sense to exhaust that on 64-bit. And the most of PHP7 is just
fine with size_t.

So? Why does it matter if libraries are using size_t? The same libraries always have been and passed an int (from string length) before PHP 7. Pass them zend_long instead now...

Before the 64-bit RFC my idea was actually to port the dependency libs for
the full 64-bit support, but it's insane. So better to fix the side effects,
as those are almost the same as in PHP5 on most of 64-bit.

Regards

Anatol

- Matt

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php

Reply via email to