On Mon, Dec 16, 2019 at 12:21 PM Claude Pache <claude.pa...@gmail.com> wrote:
> > > > Le 16 déc. 2019 à 10:58, George Peter Banyard <girg...@php.net> a écrit > : > > > > Greetings internals, > > > > I'm here to get internals's opinion about allowing a string offset to be > > replaced with arbitrary strings. > > > > Since forever strings with more then one byte have been truncated > silently > > to one byte. The case with empty string *kinda* was existent but had a > > buggy behaviour (see bug 71572 [1]) and as such has been turned into a > > warning in PHP 7.1.0, and is meant to be promoted to an Error Exception > per > > the Reclassifying engine warnings RFC. [2] > > An illustration of both these cases is available on 3v4l.org [3] > > <https://3v4l.org/O0nEM> > > > > I've got an implementation ready as a pull request on GitHub [4] (which > > still needs some polishing for it to make CI happy and fix the various > > leaks). > > > > However, the question is if this behaviour is desirable. Moreover, the > > silent truncation of strings with more than one byte should be changed to > > the same severity as the empty string case; i.e. an Error Exception. This > > seems reasonable due to the possible loss of data. > > > > What ever the decision is, a BC break is to be expected. As code which > > inadvertently tries to assign multiple bytes to an offset will know have > > all those bytes in the string whereas before only the first one was used > to > > replace the byte at the designated offset. On the other hand the > > introduction of an Error Exception is obviously a BC break but as it > points > > out to some kind of logic error. > > I would assume the impact to be minimal for both of these case. > > > > Any opinions? > > > > Best regards > > > > George Peter Banyard > > > > [1] https://bugs.php.net/bug.php?id=71572 > > [2] https://wiki.php.net/rfc/engine_warnings > > [3] https://3v4l.org/O0nEM > > [4] https://github.com/php/php-src/pull/5013 > > > Being able to replace one byte with an arbitrary string does not seem very > useful to me, unless you add the functionality of replacing an arbitrary > substring with an arbitrary string (an equivalent of array_splice() for > strings). Your implementation gives the following potentially surprising > result: > > <?php > $str = "Hello world"; > var_dump(strlen($str)); // 11 > $str[0] = "あ"; // replace the first byte with three bytes (assuming source > code is UTF-8-encoded) > $str[0] = "H"; // replace the first byte with one byte > var_dump(strlen($str)); // 13 > ?> > > IMO, the only reasonable thing to do at this point is just emitting a > Warning (or an Error). > I agree. As it's currently silently accepted, I'd recommend starting with a warning. Nikita