ID:               46737
 Updated by:       [EMAIL PROTECTED]
 Reported By:      sites at hubmed dot org
-Status:           Open
+Status:           Bogus
 Bug Type:         SimpleXML related
 Operating System: Mac OS X
 PHP Version:      5.2.6
 New Comment:

This

<?xml version="1.0"?>
<text>umlaut &#xFC; here</text>

is UTF-8, too, just written differently: &#xFC; is the numeric
character reference for ü. Nothing is wrong with the output (otherwise
it would be invalid, as the default encoding is UTF-8 if nothing is
declared in the <?xml header).
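
To illustrate that the two spellings are equivalent, here is a minimal
sketch, assuming the SimpleXML extension is loaded. The variable names
are illustrative only. It parses the same element once with the literal
UTF-8 bytes and once with the character reference, then compares the
resulting strings:

```php
<?php
// "\xC3\xBC" is the UTF-8 byte sequence for ü, written as escapes so the
// example does not depend on the encoding of this source file.
$literal = simplexml_load_string("<text>umlaut \xC3\xBC here</text>");

// The same element, with ü written as a numeric character reference.
$escaped = simplexml_load_string('<text>umlaut &#xFC; here</text>');

// SimpleXML hands back UTF-8 strings in both cases, so the contents match.
var_dump((string) $literal === (string) $escaped); // bool(true)
```

Any XML parser reading either form should arrive at the same text
content, which is why the escaped output is not a bug.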



Previous Comments:
------------------------------------------------------------------------

[2008-12-03 11:41:07] sites at hubmed dot org

Description:
------------
Using $xml->asXML() to output an XML document as a string from
SimpleXML seems to be defaulting to ISO 8859-1 rather than UTF-8,
despite all other operations being in UTF-8 (and with LANG and LC_ALL
being set to UTF-8).

There is a workaround, by manually setting '<?xml version="1.0"
encoding="UTF-8"?>' at the start of any imported XML, but it seems
strange that there isn't anywhere to set this default permanently.
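
One possible alternative to editing the imported XML string is sketched
below, on the assumption that the DOM extension is available alongside
SimpleXML. It declares the encoding on the underlying DOMDocument before
serialising. The variable names are hypothetical, and this is not a
permanent default, only a per-document setting:

```php
<?php
// Input without an XML declaration; "\xC3\xBC" is ü as UTF-8 bytes.
$doc = simplexml_load_string("<text>umlaut \xC3\xBC here</text>");

// Reach the DOMDocument that backs the SimpleXML element and
// declare the encoding to use when the document is written out.
$dom = dom_import_simplexml($doc)->ownerDocument;
$dom->encoding = 'UTF-8';

// Serialise the whole document; it now carries an encoding declaration.
$out = $dom->saveXML();
print $out;
```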

The behaviour of asXML() also seems to vary when printing part of a
SimpleXML object (where it uses UTF-8) rather than the whole document
(where it uses ISO 8859-1).

Adding
putenv('LANG=en_GB.UTF-8');
setlocale(LC_ALL, 'en_GB.UTF-8');
to the script doesn't seem to help.

Reproduce code:
---------------
// manually set encoding to UTF-8
$doc = simplexml_load_string('<?xml version="1.0" encoding="UTF-8"?><text>umlaut ü here</text>');
print $doc->asXML() . "\n";

// defaults to UTF-8
$doc = simplexml_load_string('<doc><text>umlaut ü here</text></doc>');
print $doc->text->asXML() . "\n\n";

// defaults to ISO 8859-1
$doc = simplexml_load_string('<text>umlaut ü here</text>');
print $doc->asXML() . "\n";

Expected result:
----------------
<?xml version="1.0" encoding="UTF-8"?>
<text>umlaut ü here</text>

<text>umlaut ü here</text>

<?xml version="1.0"?>
<text>umlaut ü here</text>



Actual result:
--------------
<?xml version="1.0" encoding="UTF-8"?>
<text>umlaut ü here</text>

<text>umlaut ü here</text>

<?xml version="1.0"?>
<text>umlaut &#xFC; here</text>



------------------------------------------------------------------------

