Re: [PHP-DEV] What should we do with utf8_encode and utf8_decode?

Hans Henrik Bergan Mon, 22 Mar 2021 01:29:12 -0700

I would prefer to soft-deprecate them, as we did with the mysql_ API:
they would not generate E_DEPRECATED for quite some time, but the
documentation would say
"This function is deprecated; instead use mb_convert_encoding($str,
"UTF-8", "ISO-8859-1") or iconv("ISO-8859-1", "UTF-8", $str)",
and they would only start raising E_DEPRECATED in the distant future.
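
For illustration, here is a minimal sketch of the two replacements such a documentation note would point to. The helper name latin1_to_utf8_via_extension() is hypothetical, and the function_exists() checks are only there because neither mbstring nor iconv is guaranteed to be loaded.

```php
<?php
// Hypothetical helper: convert ISO-8859-1 to UTF-8 with whichever
// extension is available, or return null if neither is loaded.
function latin1_to_utf8_via_extension(string $latin1): ?string
{
    if (function_exists('mb_convert_encoding')) {
        $converted = mb_convert_encoding($latin1, 'UTF-8', 'ISO-8859-1');
        return is_string($converted) ? $converted : null;
    }
    if (function_exists('iconv')) {
        $converted = iconv('ISO-8859-1', 'UTF-8', $latin1);
        return is_string($converted) ? $converted : null;
    }
    return null; // neither ext/mbstring nor ext/iconv is loaded
}

$latin1 = "caf\xE9"; // "café" encoded as ISO-8859-1
$utf8 = latin1_to_utf8_via_extension($latin1);

// Prints 636166c3a9 when either extension is loaded.
echo $utf8 === null ? "no converter available\n" : bin2hex($utf8) . "\n";
```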



Rowan said "they are commonly used, both correctly and
incorrectly". In my experience they are not used correctly: the people
who use them are using them incorrectly, to convert Windows-1252 to
UTF-8, not ISO-8859-1.
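
To make the Windows-1252 problem concrete, here is a rough userland sketch of an ISO-8859-1 to UTF-8 conversion. The function name sketch_utf8_from_latin1() is made up for this example, and this is not the engine's implementation. It shows what happens to the byte 0x80, which is the euro sign in Windows-1252 but the control character U+0080 in ISO-8859-1.

```php
<?php
// Hypothetical userland sketch: treat every input byte as an
// ISO-8859-1 code point (U+0000..U+00FF) and emit it as UTF-8.
function sketch_utf8_from_latin1(string $latin1): string
{
    $out = '';
    $len = strlen($latin1);
    for ($i = 0; $i < $len; $i++) {
        $b = ord($latin1[$i]);
        $out .= $b < 0x80
            ? $latin1[$i]                                      // ASCII: unchanged
            : chr(0xC0 | ($b >> 6)) . chr(0x80 | ($b & 0x3F)); // two-byte sequence
    }
    return $out;
}

$euroAsCp1252 = "\x80"; // the euro sign as stored in Windows-1252

echo bin2hex(sketch_utf8_from_latin1($euroAsCp1252)), "\n"; // c280 (U+0080, a control character)
echo bin2hex("\u{20AC}"), "\n";                             // e282ac (the actual euro sign)
```

The result is valid UTF-8, but it encodes the wrong character, so nothing fails loudly.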



On Mon, 22 Mar 2021 at 02:15, Sara Golemon <poll...@php.net> wrote:

> On Sun, Mar 21, 2021 at 9:18 AM Rowan Tommins <rowan.coll...@gmail.com>
> wrote:
>
> > A) Raise a deprecation notice in 8.1, and remove in 9.0. Do not provide
> > a specific replacement, but recommend people look at iconv() or
> > mb_convert_encoding(). There is precedent for this, such as
> > convert_cyr_string(), but it may frustrate those who are using the
> > functions correctly.
> >
> > B) Introduce new names, such as utf8_to_iso_8859_1 and
> > iso_8859_1_to_utf8; immediately make those the primary names in the
> > manual, with utf8_encode / utf8_decode as aliases. Raise deprecation
> > notices for the old names, either immediately or in some future release.
> > This gives a smoother upgrade path, but commits us to having these
> > functions as outliers in our standard library.
> >
> > C) Leave them alone forever. Treat it as the user's fault if they mess
> > things up by misunderstanding them.
> >
> >
> My preference is for a deprecation notice (but not necessarily removal ever
> -- We can argue that part a little).
>
> As for what users should use instead, obviously there are multiple options
> already in core (which you referenced), but those all have third party deps
> and can't be guaranteed the way utf8_en/decode() can (this was the point of
> moving them from xml).
>
> While I'm normally in favor of userspace things belonging in userspace
> (this particular conversion is trivial since it's a 1:1 mapping), I'm
> actually willing to see this added under a new, clearer name in
> ext/standard since this is something that's in long use, but used
> incorrectly.
>
> As for details, I don't love iso_8859_1_to_utf8(), but we can use the
> common alias for iso-8859-1 known as latin1 and call the new functions:
> utf8_from_latin1() and utf8_to_latin1() with the caveat that the latter will
> throw a ValueError for codepoints which are out of range (one of the more
> problematic issues with utf8_decode()).  That makes this not just a simple
> rename for clarity, but what I'd consider a bug-fix for an unfortunately
> unfixable function.
>
> -Sara
>
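
As a rough illustration of the stricter decode behaviour described in the quoted message, a userland version could look something like this. The name sketch_utf8_to_latin1() is hypothetical, rejecting malformed input with the same ValueError is my own assumption, and ValueError itself needs PHP 8.0 or later. utf8_decode(), by contrast, substitutes "?" for anything it cannot represent.

```php
<?php
// Hypothetical userland sketch of a strict UTF-8 to ISO-8859-1 conversion.
// Assumption: malformed UTF-8 is rejected the same way as out-of-range input.
function sketch_utf8_to_latin1(string $utf8): string
{
    $out = '';
    $len = strlen($utf8);
    for ($i = 0; $i < $len; $i++) {
        $b = ord($utf8[$i]);
        if ($b < 0x80) {          // ASCII: copy through unchanged
            $out .= $utf8[$i];
            continue;
        }
        // U+0080..U+00FF are encoded with lead byte 0xC2 or 0xC3
        // followed by exactly one continuation byte (0x80..0xBF).
        $next = ($i + 1 < $len) ? ord($utf8[$i + 1]) : -1;
        if (($b === 0xC2 || $b === 0xC3) && $next >= 0x80 && $next <= 0xBF) {
            $out .= chr((($b & 0x03) << 6) | ($next & 0x3F));
            $i++;                 // skip the continuation byte just consumed
            continue;
        }
        throw new ValueError("Byte offset $i: not representable in ISO-8859-1");
    }
    return $out;
}

echo bin2hex(sketch_utf8_to_latin1("caf\xC3\xA9")), "\n"; // 636166e9

try {
    sketch_utf8_to_latin1("\u{20AC}"); // euro sign, U+20AC, is outside Latin-1
} catch (ValueError $e) {
    echo "ValueError: ", $e->getMessage(), "\n";
}
```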
