We certainly could, but we lose some speed, especially when script_encoding == output_encoding (where we don't really need to transcode HTML blocks). Are we up for that?

-Andrei

On Aug 15, 2005, at 3:03 PM, Andi Gutmans wrote:

Wouldn't it be easiest to have inline HTML become IS_UNICODE and then not deal with the problem of remembering what the script encoding was? I thought that's what we already do today.

Andi

At 12:37 PM 8/10/2005 -0700, Andrei Zmievski wrote:
I did not have time to write the full reply earlier so here goes.

Even if we modify the output layer to be aware of various types of strings coming down the pipe, it would still need to know the encoding of IS_STRING's in order to convert them to the output encoding. This presents a particular problem for inline HTML blocks, as they are supposed to be in the script encoding, but by the time the HTML is sent to the output layer, we don't know what the source script encoding was for these HTML blocks. This problem exists in the current implementation also, because the ZEND_ECHO opcode does not keep track of what the script encoding was. This needs to be fixed, obviously.

One approach could be to implement a separate opcode for inline HTML blocks and store the name of the script encoding it came from in the opcode. Then when the output layer (or whatever else) gets to it, we can check the encoding name in the opcode vs. the output encoding and perform transcoding if necessary. This does mean that we may need to dynamically open and close converters on each output (if there were different script encodings floating around), but that can be alleviated by keeping some sort of converter cache around.
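A rough sketch of that idea in C, to make the moving parts concrete. Everything here is hypothetical: inline_html_op, converter, converter_cache and the function names are illustrative only and are not existing Zend or ICU symbols. The converter struct merely stands in for a real converter handle. It also shows the fast path where the script encoding equals the output encoding and no converter is needed at all.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

#define CONV_CACHE_SIZE 8

/* Hypothetical stand-in for a real converter handle. */
typedef struct {
    char from[32];
    char to[32];
    int in_use;
} converter;

/* Hypothetical cache; zero-initialize before first use. */
typedef struct {
    converter slots[CONV_CACHE_SIZE];
    int opens; /* number of times a converter had to be "opened" */
} converter_cache;

/* Hypothetical payload of a dedicated inline-HTML opcode. */
typedef struct {
    const char *script_encoding; /* encoding of the script the block came from */
    const char *html;
    size_t html_len;
} inline_html_op;

/* Case-insensitive comparison of encoding names ("utf-8" vs "UTF-8"). */
static int encoding_eq(const char *a, const char *b)
{
    while (*a && *b) {
        if (tolower((unsigned char)*a) != tolower((unsigned char)*b)) {
            return 0;
        }
        a++;
        b++;
    }
    return *a == *b;
}

/* Return a cached converter for (from, to), opening one only on a miss. */
static converter *cache_get(converter_cache *cache, const char *from, const char *to)
{
    converter *free_slot = NULL;
    int i;

    for (i = 0; i < CONV_CACHE_SIZE; i++) {
        converter *c = &cache->slots[i];
        if (c->in_use) {
            if (encoding_eq(c->from, from) && encoding_eq(c->to, to)) {
                return c; /* cache hit: nothing to open */
            }
        } else if (free_slot == NULL) {
            free_slot = c;
        }
    }
    if (free_slot == NULL) {
        free_slot = &cache->slots[0]; /* naive eviction when the cache is full */
    }
    strncpy(free_slot->from, from, sizeof(free_slot->from) - 1);
    free_slot->from[sizeof(free_slot->from) - 1] = '\0';
    strncpy(free_slot->to, to, sizeof(free_slot->to) - 1);
    free_slot->to[sizeof(free_slot->to) - 1] = '\0';
    free_slot->in_use = 1;
    cache->opens++;
    return free_slot;
}

/* Returns NULL when script and output encodings match (no transcoding needed). */
static converter *converter_for_op(converter_cache *cache, const inline_html_op *op,
                                   const char *output_encoding)
{
    if (encoding_eq(op->script_encoding, output_encoding)) {
        return NULL;
    }
    return cache_get(cache, op->script_encoding, output_encoding);
}
```

With a cache like this, repeated inline HTML blocks from the same script reuse one converter, and blocks already in the output encoding skip conversion entirely.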

I am open to other ideas.

-Andrei

On Aug 10, 2005, at 8:34 AM, Andrei Zmievski wrote:

That's not true, actually. 'echo' and 'print' resolve to the ZEND_ECHO opcode, which calls zend_print_variable(), which in turn calls zend_make_printable_zval(). Now, this last function is supposed to take a zval and turn it into a printable string, of course, which is then output using utility_functions->write_function, aka php_body_write(). All that function cares about is how to output a binary string. So, if we want to bubble the conversion down to the output layer, we probably need to change the write function so that it takes a void* and a type and knows how to deal with them appropriately.
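To illustrate the shape of such a write function, here is a minimal sketch in C. It is not the actual php_body_write() signature. The out_type enum, typed_write() and utf8_put() are hypothetical names, and the sketch assumes Unicode data arrives as UTF-16 code units in native byte order and that the output encoding is UTF-8.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical type tags passed alongside the void* payload. */
typedef enum { OUT_BINARY, OUT_UNICODE } out_type;

/* Encode one code point as UTF-8; returns the number of bytes written (1 to 4). */
static size_t utf8_put(uint32_t cp, char *out)
{
    if (cp < 0x80) {
        out[0] = (char)cp;
        return 1;
    }
    if (cp < 0x800) {
        out[0] = (char)(0xC0 | (cp >> 6));
        out[1] = (char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {
        out[0] = (char)(0xE0 | (cp >> 12));
        out[1] = (char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (char)(0x80 | (cp & 0x3F));
        return 3;
    }
    out[0] = (char)(0xF0 | (cp >> 18));
    out[1] = (char)(0x80 | ((cp >> 12) & 0x3F));
    out[2] = (char)(0x80 | ((cp >> 6) & 0x3F));
    out[3] = (char)(0x80 | (cp & 0x3F));
    return 4;
}

/*
 * Write `len` units of `data` into `out` (capacity `cap` bytes).
 * OUT_BINARY:  len is in bytes; data is copied through untouched.
 * OUT_UNICODE: len is in UTF-16 code units; data is converted to UTF-8.
 * Returns the number of bytes written, or 0 if `out` is too small.
 */
static size_t typed_write(out_type type, const void *data, size_t len,
                          char *out, size_t cap)
{
    const uint16_t *u = (const uint16_t *)data;
    size_t n = 0;
    size_t i;

    if (type == OUT_BINARY) {
        if (len > cap) {
            return 0;
        }
        memcpy(out, data, len);
        return len;
    }

    for (i = 0; i < len; i++) {
        uint32_t cp = u[i];

        if (cp >= 0xD800 && cp <= 0xDBFF && i + 1 < len &&
            u[i + 1] >= 0xDC00 && u[i + 1] <= 0xDFFF) {
            /* Combine a surrogate pair into one code point. */
            cp = 0x10000 + ((cp - 0xD800) << 10) + (u[i + 1] - 0xDC00);
            i++;
        } else if (cp >= 0xD800 && cp <= 0xDFFF) {
            cp = 0xFFFD; /* unpaired surrogate: emit the replacement character */
        }
        if (cap - n < 4) {
            return 0; /* conservative: require room for the longest sequence */
        }
        n += utf8_put(cp, out + n);
    }
    return n;
}
```

The point of the sketch is only that the caller hands over raw data plus a type tag, and the conversion decision lives in one place at the output layer.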

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php
