[PHP-DEV] quite a big bug in DOM

Ron Korving Fri, 16 Sep 2005 01:07:13 -0700

Hi,

I found a bug in DOM. It surprises me that it's never been seen and/or fixed
before. I can't find anything about it in the PHP bug tracker, anyway. The
reason I'm posting here instead of filing a bug report is that I'm not sure
whether this is a problem in the PHP extension or in the DOM library itself.
In the latter case there's nothing anybody here can do, I guess.


This is the situation:

<?php
  // Case 1: the body contains nothing but a single &nbsp; entity.
  $doc = new DOMDocument();
  $doc->loadHTML('<html><body>&nbsp;</body></html>');
  echo "'".$doc->getElementsByTagName('body')->item(0)->textContent."'\n";

  // Case 2: the same entity between two words.
  $doc = new DOMDocument();
  $doc->loadHTML('<html><body>foo&nbsp;bar</body></html>');
  echo "'".$doc->getElementsByTagName('body')->item(0)->textContent."'\n";
?>

Output:

'Â '
'fooÂ bar'

Where the heck do these 'Â's come from when it parses an &nbsp;? I hope
someone can shed some light on the next step to take in order to fix
this.
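
A minimal sketch for narrowing this down: dump the raw bytes of the text
node rather than echoing them, so the terminal's encoding cannot influence
what is seen. The conversion step assumes (not confirmed anywhere above)
that the DOM extension returns strings as UTF-8; $text, $hexRaw and
$hexLatin1 are just illustrative variable names.

```php
<?php
  // Load the same minimal document as in the first test case.
  $doc = new DOMDocument();
  $doc->loadHTML('<html><body>&nbsp;</body></html>');
  $text = $doc->getElementsByTagName('body')->item(0)->textContent;

  // bin2hex() shows the exact byte sequence, independent of the terminal.
  $hexRaw = bin2hex($text);
  echo $hexRaw."\n";

  // Assumption: $text is UTF-8. Converting it to ISO-8859-1 should then
  // collapse the non-breaking space to a single byte.
  $hexLatin1 = bin2hex(iconv('UTF-8', 'ISO-8859-1', $text));
  echo $hexLatin1."\n";
?>
```

If the first line prints c2a0, that is the two-byte UTF-8 encoding of
U+00A0 (the non-breaking space), and the 'Â' would simply be the 0xC2 byte
as rendered by an ISO-8859-1 terminal. In that case the second line should
print a0, and the parsing itself would be working as intended.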

Thanks,

Ron Korving

-- 
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php
