We certainly could, but we lose some speed, especially when
script_encoding == output_encoding (where we don't really need to
transcode HTML blocks). Are we up for that?
-Andrei
On Aug 15, 2005, at 3:03 PM, Andi Gutmans wrote:
Wouldn't it be easiest to have inline html become IS_UNICODE and then
not deal with the problem of remembering what the script encoding was? I
thought that's what we already do today.
Andi
At 12:37 PM 8/10/2005 -0700, Andrei Zmievski wrote:
I did not have time to write the full reply earlier so here goes.
Even if we modify the output layer to be aware of various types of
strings coming down the pipe, it would still need to know the
encoding of IS_STRING's in order to convert them to the output
encoding. This presents a particular problem for inline HTML blocks,
as they are supposed to be in the script encoding, but by the time
the HTML is sent to the output layer, we don't know what the source
script encoding was for these HTML blocks. This problem exists in the
current implementation also, because the ZEND_ECHO opcode does not
keep track of what the script encoding was. This needs to be fixed,
obviously.
One approach could be to implement a separate opcode for inline HTML
blocks and store the name of the script encoding it came from in the
opcode. Then when the output layer (or whatever else) gets to it, we
can check the encoding name in the opcode vs. the output encoding and
perform transcoding if necessary. This does mean that we may need to
dynamically open and close converters on each output (if there were
different script encodings floating around), but that can be alleviated by
keeping some sort of converter cache around.
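To make the idea concrete, here is a minimal sketch of the encoding check and the converter cache, not actual engine code. Every name in it (demo_converter, demo_needs_transcoding, demo_get_converter, demo_open_count) is hypothetical, the "converter" is a placeholder struct rather than a real conversion handle, and encoding names are compared with a plain strcmp, whereas a real implementation would have to cope with aliases and case differences.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a real converter handle. */
typedef struct {
    char from[32];   /* script encoding name stored with the opcode */
    char to[32];     /* output encoding name */
    int  in_use;
} demo_converter;

#define DEMO_CACHE_SIZE 8

static demo_converter demo_cache[DEMO_CACHE_SIZE];
static int demo_open_count = 0;   /* number of converters actually "opened" */

/* Returns 0 when the inline HTML can be passed through untouched,
 * 1 when it has to be transcoded. Exact name match is an assumption. */
static int demo_needs_transcoding(const char *script_enc, const char *output_enc)
{
    return strcmp(script_enc, output_enc) != 0;
}

/* Returns a cached converter for the (from, to) pair, opening one only on
 * a cache miss. Returns NULL if a name is too long or the cache is full;
 * a real cache would evict an entry instead. */
static demo_converter *demo_get_converter(const char *from, const char *to)
{
    demo_converter *free_slot = NULL;
    int i;

    if (strlen(from) >= sizeof demo_cache[0].from ||
        strlen(to) >= sizeof demo_cache[0].to) {
        return NULL;
    }

    for (i = 0; i < DEMO_CACHE_SIZE; i++) {
        demo_converter *c = &demo_cache[i];
        if (c->in_use) {
            if (strcmp(c->from, from) == 0 && strcmp(c->to, to) == 0) {
                return c;                 /* cache hit: nothing is reopened */
            }
        } else if (free_slot == NULL) {
            free_slot = c;                /* remember the first empty slot */
        }
    }

    if (free_slot == NULL) {
        return NULL;
    }

    strcpy(free_slot->from, from);
    strcpy(free_slot->to, to);
    free_slot->in_use = 1;
    demo_open_count++;                    /* the expensive open happens here */
    return free_slot;
}
```

With this shape, the handler for the new opcode would call demo_needs_transcoding() first and only ask for a converter when the names differ, so the common case of matching encodings costs a single string comparison.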
I am open to other ideas.
-Andrei
On Aug 10, 2005, at 8:34 AM, Andrei Zmievski wrote:
That's not true, actually. 'echo' and 'print' resolve to the ZEND_ECHO
opcode which calls zend_print_variable(), which in turn calls
zend_make_printable_zval(). Now, this last function is supposed to
take a zval and turn it into a printable string, of course, which is
then output using utility_functions->write_function aka
php_body_write(). All that function cares about is how to output a
binary string. So, if we want to bubble the conversion down to the
output layer, we probably need to change the write function so that
it takes a void* and a type and knows how to deal with them
appropriately.
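As a rough illustration of a write function that takes a void* and a type, here is a small sketch. It is not the real php_body_write() signature: demo_str_type, demo_write, demo_emit, demo_utf8_encode and the fixed buffer are all made up for the example, Unicode strings are assumed to be UTF-16 code units, and the output encoding is assumed to be UTF-8.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical type tag passed alongside the data pointer. */
typedef enum { DEMO_BINARY, DEMO_UNICODE } demo_str_type;

/* Hypothetical sink standing in for the real output layer. */
static char   demo_out[256];
static size_t demo_out_len = 0;

static void demo_emit(const char *bytes, size_t n)
{
    size_t room = sizeof demo_out - demo_out_len;
    if (n > room) {
        n = room;                       /* the sketch simply truncates */
    }
    memcpy(demo_out + demo_out_len, bytes, n);
    demo_out_len += n;
}

/* Encodes one code point as UTF-8 and returns the number of bytes written. */
static size_t demo_utf8_encode(unsigned long cp, char *buf)
{
    if (cp < 0x80) {
        buf[0] = (char)cp;
        return 1;
    }
    if (cp < 0x800) {
        buf[0] = (char)(0xC0 | (cp >> 6));
        buf[1] = (char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {
        buf[0] = (char)(0xE0 | (cp >> 12));
        buf[1] = (char)(0x80 | ((cp >> 6) & 0x3F));
        buf[2] = (char)(0x80 | (cp & 0x3F));
        return 3;
    }
    buf[0] = (char)(0xF0 | (cp >> 18));
    buf[1] = (char)(0x80 | ((cp >> 12) & 0x3F));
    buf[2] = (char)(0x80 | ((cp >> 6) & 0x3F));
    buf[3] = (char)(0x80 | (cp & 0x3F));
    return 4;
}

/* len counts bytes for DEMO_BINARY and UTF-16 code units for DEMO_UNICODE. */
static void demo_write(const void *data, size_t len, demo_str_type type)
{
    const unsigned short *u = data;
    size_t i;

    if (type == DEMO_BINARY) {
        demo_emit(data, len);           /* binary strings pass through as-is */
        return;
    }

    for (i = 0; i < len; i++) {
        unsigned long cp = u[i];
        char buf[4];

        if (cp >= 0xD800 && cp <= 0xDBFF && i + 1 < len &&
            u[i + 1] >= 0xDC00 && u[i + 1] <= 0xDFFF) {
            /* combine a surrogate pair into one code point */
            cp = 0x10000 + ((cp - 0xD800) << 10) + (u[i + 1] - 0xDC00);
            i++;
        } else if (cp >= 0xD800 && cp <= 0xDFFF) {
            cp = 0xFFFD;                /* unpaired surrogate: replacement char */
        }
        demo_emit(buf, demo_utf8_encode(cp, buf));
    }
}
```

The point of the type tag is that the caller no longer has to flatten everything into a binary string first; the conversion decision moves into the one function that knows the output encoding.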
--
PHP Internals - PHP Runtime Development Mailing List