Greetings,

I have noticed a lot of recent comments, posts, and even Nikita's recent PHP Russia video discussing scalar objects, a potential future feature that I believe already has widespread support and would see widespread use once it arrives.

I think most scalars would be self-explanatory, but spurred on by discussion here and elsewhere about string functions, I would like to debate the string object in particular, and specifically how encoding would work in combination with such a scalar object.

I see two options available:



## Option 1

Every String() scalar object would expose methods covering the standard byte-safe, ASCII and multibyte string functions. This may result in something like:

"Hello".substr(1)
"Hello".mbSubstr(1)
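
To illustrate why both method families would be needed under this option, here is a minimal sketch using today's substr() and mb_substr(), which the proposed methods would presumably wrap. It assumes the mbstring extension is loaded; the variable names are only for illustration.

```php
<?php
// "h\u{00E9}llo" is "héllo"; the "é" occupies two bytes in UTF-8 (0xC3 0xA9).
$word = "h\u{00E9}llo";

// Byte-oriented: offset 1 is the first byte of "é", so we get half a character.
$byteSlice = substr($word, 1, 1);

// Character-oriented: offset 1 is the whole character "é".
$charSlice = mb_substr($word, 1, 1, 'UTF-8');

$byteLength = strlen($word);             // counts bytes
$charLength = mb_strlen($word, 'UTF-8'); // counts characters

var_dump(bin2hex($byteSlice)); // string(2) "c3"
var_dump(bin2hex($charSlice)); // string(4) "c3a9"
var_dump($byteLength);         // int(6)
var_dump($charLength);         // int(5)
```

With Option 1 the caller has to know which of the two behaviours applies to the data at hand, exactly as with the existing functions.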



## Option 2

Allow the string to be bound to a specific encoding, which would require _zend_string to be extended with a pointer to a structure containing encoding helpers. All of the php-src macros would need updating to take these into account.

The scalar object methods would then use that to detect which implementation to use.

"Hello".substr(1) // would work as expected regardless of encoding
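
As a rough userland approximation of what binding an encoding to a string could look like, the sketch below carries the encoding alongside the bytes and dispatches on it. The EncodedString class and its methods are hypothetical and are not part of any proposal; it assumes PHP 7.4+ and the mbstring extension, and says nothing about how _zend_string itself would change.

```php
<?php
// Hypothetical stand-in for an encoding-bound string: bytes plus their encoding.
final class EncodedString
{
    private string $bytes;
    private string $encoding;

    public function __construct(string $bytes, string $encoding)
    {
        // Refuse bytes that are not valid in the claimed encoding.
        if (!mb_check_encoding($bytes, $encoding)) {
            throw new InvalidArgumentException("Bytes are not valid $encoding");
        }
        $this->bytes = $bytes;
        $this->encoding = $encoding;
    }

    // One substr() for every encoding: the bound encoding picks the behaviour.
    public function substr(int $start, ?int $length = null): self
    {
        $slice = mb_substr($this->bytes, $start, $length, $this->encoding);
        return new self($slice, $this->encoding);
    }

    public function length(): int
    {
        return mb_strlen($this->bytes, $this->encoding);
    }

    public function __toString(): string
    {
        return $this->bytes;
    }
}

// The same five characters in two different encodings.
$utf8   = new EncodedString("h\u{00E9}llo", 'UTF-8');   // 6 bytes
$latin1 = new EncodedString("h\xE9llo", 'ISO-8859-1');  // 5 bytes

var_dump(bin2hex((string) $utf8->substr(1, 1)));   // string(4) "c3a9"
var_dump(bin2hex((string) $latin1->substr(1, 1))); // string(2) "e9"
var_dump($utf8->length(), $latin1->length());      // int(5) int(5)
```

The caller writes the same substr(1, 1) in both cases, which is the behaviour Option 2 is after; the open question of how a string gets its encoding in the first place remains.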

My question to everyone is, what mechanism would be used to mark a string as being of a specific encoding? Naturally a .toUTF8() would be possible, but I'm not sure that would be as tidy as it could be.

"Hello".toUTF8()
$_GET['example'].toUTF8()

In certain languages, a string literal can be prefixed with L to make it a wide-character string (16 bits per character on Windows), which is particularly useful for Windows API calls. Perhaps that could be the way to go for interned strings in the code itself?

L"Hello"
L$_GET['example']

But most strings we use will be coming from an external source, such as user input or a database. What would be the cleanest way to mark them as having a specific encoding?
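
For comparison, this is roughly how the marking is done by convention today: validate and convert at the boundary, then treat everything past that point as UTF-8. The helper name toUtf8() and the sample values are hypothetical, the literals stand in for data from a database or request, and the mbstring extension is assumed.

```php
<?php
// Hypothetical boundary helper: check the declared encoding, then normalise to UTF-8.
function toUtf8(string $raw, string $declaredEncoding = 'UTF-8'): string
{
    if (!mb_check_encoding($raw, $declaredEncoding)) {
        throw new UnexpectedValueException("Input is not valid $declaredEncoding");
    }

    return $declaredEncoding === 'UTF-8'
        ? $raw
        : mb_convert_encoding($raw, 'UTF-8', $declaredEncoding);
}

// Stand-ins for external input: the same word from a Latin-1 source and a UTF-8 source.
$fromLegacySource = "caf\xE9";
$fromUtf8Source   = "caf\u{00E9}";

$a = toUtf8($fromLegacySource, 'ISO-8859-1');
$b = toUtf8($fromUtf8Source);

var_dump($a === $b);   // bool(true)
var_dump(bin2hex($a)); // string(10) "636166c3a9"
```

The weakness is that the result is still a plain string, so nothing stops unconverted input from being mixed in later. An encoding-aware scalar object could make that marking part of the value itself.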

Going a bit more out there, would this bring about the need to add an encoding-specific native type, at least for the de facto encoding of the web?

function x(utf8_string $x): utf8_string { ... }

These are all just questions I have no answer to or firm opinion on, but I would be interested to know people's general ideas as to solutions.



--
Mark Randall

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php
