I think the main issue here is that if your script encoding is set to UTF-8 and you do everything in UTF-8, then these large blocks of UTF-8 inline HTML are going to make a UTF-8 -> UTF-16 -> UTF-8 conversion round trip on every request. It would be nice if we could somehow avoid that.
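A minimal C sketch of what skipping that round trip could look like, combined with the per-block encoding tag proposed in the quoted message below. Everything here is hypothetical and is not Zend API: `inline_html_block`, `output_sink`, `emit_inline_html()` and the hand-rolled `latin1_to_utf8()` are illustration only, and a real implementation would go through a proper converter library rather than a one-off ISO-8859-1 routine.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <strings.h>   /* strcasecmp (POSIX) */

/* Hypothetical: an inline HTML block that remembers its script encoding. */
typedef struct {
    const char *data;
    size_t      len;
    const char *script_encoding;
} inline_html_block;

/* Hypothetical output sink: a fixed buffer plus a transcode counter. */
typedef struct {
    char   buf[256];
    size_t len;
    int    transcodes;
} output_sink;

void sink_write(output_sink *out, const char *bytes, size_t n)
{
    assert(out->len + n <= sizeof(out->buf));
    memcpy(out->buf + out->len, bytes, n);
    out->len += n;
}

/* Illustrative converter for exactly one pair: ISO-8859-1 -> UTF-8. */
void latin1_to_utf8(output_sink *out, const char *bytes, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        unsigned char c = (unsigned char)bytes[i];
        if (c < 0x80) {
            char one = (char)c;
            sink_write(out, &one, 1);
        } else {
            char two[2];
            two[0] = (char)(0xC0 | (c >> 6));
            two[1] = (char)(0x80 | (c & 0x3F));
            sink_write(out, two, 2);
        }
    }
    out->transcodes++;
}

/* Returns 0 on success, -1 if this sketch has no converter for the pair. */
int emit_inline_html(output_sink *out, const inline_html_block *b,
                     const char *output_encoding)
{
    /* Fast path: same encoding on both sides, so pass the bytes through
     * untouched and never go via UTF-16. */
    if (strcasecmp(b->script_encoding, output_encoding) == 0) {
        sink_write(out, b->data, b->len);
        return 0;
    }
    if (strcasecmp(b->script_encoding, "ISO-8859-1") == 0 &&
        strcasecmp(output_encoding, "UTF-8") == 0) {
        latin1_to_utf8(out, b->data, b->len);
        return 0;
    }
    return -1;
}
```

With a UTF-8 script and UTF-8 output, `emit_inline_html()` copies the block and leaves `transcodes` at zero. Only a block whose script encoding differs from the output encoding pays for a conversion.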
-Rasmus

Andi Gutmans wrote:
> Wouldn't it be easiest to have inline html become IS_UNICODE and then
> not deal with the problem of remembering what the script encoding was? I
> thought that's what we already do today.
>
> Andi
>
> At 12:37 PM 8/10/2005 -0700, Andrei Zmievski wrote:
>
>> I did not have time to write the full reply earlier so here goes.
>>
>> Even if we modify the output layer to be aware of various types of
>> strings coming down the pipe, it would still need to know the encoding
>> of IS_STRING's in order to convert them to the output encoding. This
>> presents a particular problem for inline HTML blocks, as they are
>> supposed to be in the script encoding, but by the time the HTML is
>> sent to the output layer, we don't know what the source script
>> encoding was for these HTML blocks. This problem exists in the current
>> implementation also, because the ZEND_ECHO opcode does not keep track
>> of what the script encoding was. This needs to be fixed, obviously.
>>
>> One approach could be to implement a separate opcode for inline HTML
>> blocks and store the name of the script encoding it came from in the
>> opcode. Then when the output layer (or whatever else) gets to it, we
>> can check the encoding name in the opcode vs. the output encoding and
>> perform transcoding if necessary. This does mean that we may need to
>> dynamically open and close converters on each output (if there were
>> different script encodings floating around), but that can be alleviated
>> by keeping some sort of converter cache around.
>>
>> I am open to other ideas.
>>
>> -Andrei
>>
>> On Aug 10, 2005, at 8:34 AM, Andrei Zmievski wrote:
>>
>>> That's not true, actually. 'echo' and 'print' resolve to ZEND_ECHO
>>> opcode which calls zend_print_variable(), which in turn calls
>>> zend_make_printable_zval(). Now, this last function is supposed to
>>> take a zval and turn it into a printable string, of course, which is
>>> then output using utility_functions->write_function aka
>>> php_body_write(). All that function cares about is how to output a
>>> binary string. So, if we want to bubble the conversion down to the
>>> output layer, we probably need to change the write function so that
>>> it takes a void* and a type and knows how to deal with them
>>> appropriately.

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php
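
The write function change described in the earliest quoted message (take a `void*` and a type, and let the output layer decide what to do) might look roughly like the C sketch below. This is an assumption-laden illustration, not the engine's actual signature: `out_type`, `OUT_BINARY`, `OUT_UTF16` and `typed_write()` are made-up names, the output encoding is hard-wired to UTF-8, and Unicode strings are assumed to be UTF-16 code units.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical type tags, standing in for IS_STRING / IS_UNICODE. */
typedef enum { OUT_BINARY, OUT_UTF16 } out_type;

/*
 * Write `count` units of `data` into dst as bytes.
 * OUT_BINARY: count is in bytes, copied as-is.
 * OUT_UTF16:  count is in 16-bit code units, encoded as UTF-8.
 * Returns the number of bytes written, or 0 if dst is too small.
 */
size_t typed_write(char *dst, size_t cap, const void *data, size_t count,
                   out_type type)
{
    size_t n = 0;

    if (type == OUT_BINARY) {
        if (count > cap) return 0;
        memcpy(dst, data, count);
        return count;
    }

    const uint16_t *u = data;
    for (size_t i = 0; i < count; i++) {
        uint32_t cp = u[i];

        /* Combine a valid surrogate pair into one code point. Unpaired
         * surrogates are not rejected in this sketch. */
        if (cp >= 0xD800 && cp <= 0xDBFF && i + 1 < count &&
            u[i + 1] >= 0xDC00 && u[i + 1] <= 0xDFFF) {
            cp = 0x10000 + ((cp - 0xD800) << 10) + (u[i + 1] - 0xDC00);
            i++;
        }

        if (n + 4 > cap) return 0;   /* conservative: room for 4 bytes */

        if (cp < 0x80) {
            dst[n++] = (char)cp;
        } else if (cp < 0x800) {
            dst[n++] = (char)(0xC0 | (cp >> 6));
            dst[n++] = (char)(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {
            dst[n++] = (char)(0xE0 | (cp >> 12));
            dst[n++] = (char)(0x80 | ((cp >> 6) & 0x3F));
            dst[n++] = (char)(0x80 | (cp & 0x3F));
        } else {
            dst[n++] = (char)(0xF0 | (cp >> 18));
            dst[n++] = (char)(0x80 | ((cp >> 12) & 0x3F));
            dst[n++] = (char)(0x80 | ((cp >> 6) & 0x3F));
            dst[n++] = (char)(0x80 | (cp & 0x3F));
        }
    }
    return n;
}
```

The point of the tag is that callers such as the echo path would no longer have to flatten everything to a binary string first. The conversion happens once, at the output layer, and binary data is never touched.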