ID:               34591
User updated by:  priit at ww dot ee
Reported By:      priit at ww dot ee
Status:           Bogus
Bug Type:         XML related
Operating System: Windows XP SP2
PHP Version:      5.0.5
New Comment:

The charset has nothing to do with it. I tried converting the input to
latin1 first, then to utf8, and changed the xml_parser_create charset
accordingly, but the result was always the same! The characters in my
example XML are compatible with ISO-8859-1, although the XML input is
declared as ISO-8859-15 ... when I also changed the encoding in the XML
itself to something else, the result was still the same.


Previous Comments:
------------------------------------------------------------------------

[2005-09-21 23:35:56] priit at ww dot ee

<?php
$file = 'failid/kontakt.xml';

function startElement($parser, $name, $attrs)
    {
    global $temp_name;
    $temp_name = $name;
    }

function endElement($parser, $name)
    {
    global $temp_name,$temp_value,$moodul;
    if($temp_name==$name) {$moodul .= "($temp_name : $temp_value)\n";}
    }

function characterData($parser, $data)
    {
    global $temp_value;
    $temp_value = $data;
    }

$moodul = '<PRE>';
$xml_parser = xml_parser_create('ISO-8859-1');
xml_set_element_handler($xml_parser, "startElement", "endElement");
xml_set_character_data_handler($xml_parser, "characterData");
if (!($fp = fopen($file, "r"))) { die("could not open XML input"); }
$data = fread($fp, filesize($file));
if(!xml_parse($xml_parser, $data, feof($fp)))
    {
    die(sprintf("XML error: %s at line %d",
    xml_error_string(xml_get_error_code($xml_parser)),
    xml_get_current_line_number($xml_parser)));
    }

xml_parser_free($xml_parser);
$moodul .= '</PRE>';

echo $moodul;

?>
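
A possible explanation, not confirmed anywhere in this thread: the PHP
manual notes that the character data handler may be called more than
once for the text of a single element, and the script above overwrites
$temp_value on every call. The following is a minimal sketch of the
same handlers with the text accumulated instead. The names
collectStart, collectText, collectEnd, $buffer and $fields are
hypothetical, and the sample is embedded as a UTF-8 string (an
assumption, the original file is declared as ISO-8859-15) so that the
snippet does not need an external file.

```php
<?php
// Assumption: sample data embedded as UTF-8 so the snippet is self-contained.
$xml = '<?xml version="1.0" encoding="UTF-8"?>'
     . '<rida><aadress>Põllu 1202-2, 10920 Tallinn</aadress>'
     . '<asutus>Tööturuamet</asutus></rida>';

$buffer = '';       // text collected for the element currently open
$fields = array();  // element name => complete text

function collectStart($parser, $name, $attrs)
{
    global $buffer;
    $buffer = '';   // a new element starts: discard any leftover text
}

function collectText($parser, $data)
{
    global $buffer;
    $buffer .= $data;   // append: the handler may fire several times per element
}

function collectEnd($parser, $name)
{
    global $buffer, $fields;
    if (trim($buffer) !== '') {
        $fields[$name] = $buffer;   // store the complete text once the element closes
    }
    $buffer = '';
}

$parser = xml_parser_create('UTF-8');
// Keep element names as written instead of upper-casing them.
xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 0);
xml_set_element_handler($parser, 'collectStart', 'collectEnd');
xml_set_character_data_handler($parser, 'collectText');

// The whole document is passed in one call, so the final flag is true.
if (!xml_parse($parser, $xml, true)) {
    die(sprintf("XML error: %s at line %d",
        xml_error_string(xml_get_error_code($parser)),
        xml_get_current_line_number($parser)));
}

foreach ($fields as $name => $value) {
    echo "$name => $value\n";
}
```

With the text appended rather than overwritten, each field should come
out complete no matter how the parser splits the character data.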

------------------------------------------------------------------------

[2005-09-21 23:17:34] [EMAIL PROTECTED]

Try using compatible charset.


------------------------------------------------------------------------

[2005-09-21 23:15:36] priit at ww dot ee

Using this PHP code:
$xml_parser = xml_parser_create('ISO-8859-1');
xml_set_element_handler($xml_parser, "startElement", "endElement");
xml_set_character_data_handler($xml_parser, "characterData");

$data = fread($fp, filesize($file));
if(!xml_parse($xml_parser, $data, feof($fp)))
    {
    die(sprintf("XML error: %s at line %d",
    xml_error_string(xml_get_error_code($xml_parser)),
    xml_get_current_line_number($xml_parser)));
    }

xml_parser_free($xml_parser);

------------------------------------------------------------------------

[2005-09-21 23:04:27] [EMAIL PROTECTED]

Thank you for this bug report. To properly diagnose the problem, we
need a short but complete example script to be able to reproduce
this bug ourselves. 

A proper reproducing script starts with <?php and ends with ?>,
is max. 10-20 lines long and does not require any external 
resources such as databases, etc.

If possible, make the script source available online and provide
a URL to it here. Try to avoid embedding huge scripts into the report.



------------------------------------------------------------------------

[2005-09-21 22:59:36] priit at ww dot ee

Description:
------------
XML parsing seems to skip certain data, specifically the part of a
field that comes before öäüõ (non-ASCII letters) and similar characters.

The same code has worked perfectly on every PHP 4.1+ version I have
tested (up to 4.3.6 or 4.3.7).

Reproduce code:
---------------
<?xml version="1.0" encoding="ISO-8859-15"?>
  <rida>
   <tootaja_id>519</tootaja_id>
   <eesnimi>XxX</eesnimi>
   <perekonnanimi>YyY</perekonnanimi>
   <synniaeg>21.02.1900</synniaeg>
   <aadress>Põllu 1202-2, 10920 Tallinn</aadress>
   <haridustase>kõrgharidus</haridustase>
   <eriala>kaubandusökonoomika</eriala>
   <telefon>625 7700</telefon>
   <e_post>[EMAIL PROTECTED]</e_post>
   <ametijuhend_viit></ametijuhend_viit>
   <asutus>Tööturuamet</asutus>
   <yksus_nimetus>Juhtkond</yksus_nimetus>
   <yksus_id>10</yksus_id>
   <prioriteet>1</prioriteet>
   <on_peatumine>0</on_peatumine>
  </rida>

Expected result:
----------------
aadress => Põllu 1202-2, 10920 Tallinn
haridustase => kõrgharidus
eriala => kaubandusökonoomika
asutus => Tööturuamet

(I skipped the lines that were parsed correctly.)

The PHP code I use comes from a sample on the PHP homepage, or
something similar.

Actual result:
--------------
aadress => õllu 1202-2, 10920 Tallinn
haridustase => õrgharidus
eriala => ökonoomika
asutus => ööturuamet
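
The truncated values above are consistent with the character data
arriving in more than one handler call while only the last piece is
kept. The sketch below simulates that without the XML parser. The
split point in $chunks is an assumption chosen to match the observed
output, not something taken from a parser trace.

```php
<?php
// Hypothetical chunking: assume the text of <aadress> reaches the
// character data handler in two calls, split before the first
// non-ASCII letter.
$chunks = array('P', 'õllu 1202-2, 10920 Tallinn');

$overwritten = '';
$appended = '';
foreach ($chunks as $chunk) {
    $overwritten = $chunk;   // mirrors "$temp_value = $data;"
    $appended .= $chunk;     // mirrors "$temp_value .= $data;"
}

echo "overwrite: $overwritten\n";   // overwrite: õllu 1202-2, 10920 Tallinn
echo "append:    $appended\n";      // append:    Põllu 1202-2, 10920 Tallinn
```

Under that assumption, overwriting reproduces the reported value for
aadress, while appending yields the expected one.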




------------------------------------------------------------------------

