Hi,

Thursday, January 15, 2004, 3:07:02 AM, you wrote:
RS> Hello,

RS> This question may border on OT...

RS> I have a web form where visitors must enter large amounts of text at one
RS> time (text area).  Once submitted, the large amount of text is stored as
RS> a CLOB in an Oracle database.

RS> Some of my visitors create their text in MS-Word and then cut and paste
RS> it into the text area and then submit the form.

RS> When I retrieve it from the database, I do a stripslashes, htmlentities
RS> and nl2br in that order to preserve the format of the submitted text.
RS> When I view this text, single or double quotes show up as little white
RS> square blocks.  I've tested this out with MS-Word on a Windows machine
RS> and a Mac machine.  Same thing happens with either OS.  This only
RS> happens when they cut and paste from MS-Word into the text area.  If
RS> they type text into the text area directly, everything is fine...

RS> I know I can search through their submitted text and swap out the 
RS> unrecognized character and insert the proper one.  I just don't know
RS> what to look for as being the unrecognized character.

RS> I've googled all over looking at ASCII charts and keyboard maps.
RS> Nothing mentions MS-Word specific information though.

RS> Anyone out there dealt with this before?

RS> Thanks,
RS> R


The quotes are Word's "smart" (curly) quotes. In UTF-8 each one is
actually a sequence of three bytes, with decimal values

226 128 156   (left double quote, U+201C)
226 128 157   (right double quote, U+201D)
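As a quick sanity check, here is a small sketch showing that those byte values are the UTF-8 encodings of the curly double quotes. The variable names are only for illustration, and it assumes the form data really arrives as UTF-8:

```php
<?php
// The byte values quoted above, built with chr() as in the rest of this mail.
$left  = chr(226) . chr(128) . chr(156);
$right = chr(226) . chr(128) . chr(157);

// Hex view of the same bytes.
echo bin2hex($left), "\n";
echo bin2hex($right), "\n";

// Decoding the HTML entities to UTF-8 should give exactly the same bytes.
$ldquo = html_entity_decode('&ldquo;', ENT_QUOTES, 'UTF-8');
$rdquo = html_entity_decode('&rdquo;', ENT_QUOTES, 'UTF-8');
var_dump($left === $ldquo, $right === $rdquo);
```

If both comparisons come out true, the "unrecognized characters" are ordinary UTF-8 punctuation being shown in a page or by a function that expects a single-byte charset.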

Here is a bit of code to fix them and a few others. I would be
interested if anyone knows the complete set of these weirdos :)

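$crap = array(
    chr(226).chr(128).chr(147), // U+2013 en dash
    chr(226).chr(128).chr(156), // U+201C left double quote
    chr(226).chr(128).chr(157), // U+201D right double quote
    chr(226).chr(128).chr(153), // U+2019 right single quote / apostrophe
);
$clean = array('-', '"', '"', "'");
$content = str_replace($crap, $clean, $text);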
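For what it's worth, the same idea can be wrapped in a function and extended with a few more characters that Word commonly auto-inserts. This is only a sketch, not a complete list: the function name normalize_word_punctuation is made up for this example, the extra entries (left single quote, em dash, ellipsis) are my additions, and it assumes the input is UTF-8.

```php
<?php
// Hypothetical helper: replace common Word "smart" punctuation (UTF-8)
// with plain ASCII equivalents. The map is NOT a complete set.
function normalize_word_punctuation($text)
{
    $map = array(
        "\xE2\x80\x98" => "'",   // U+2018 left single quote
        "\xE2\x80\x99" => "'",   // U+2019 right single quote / apostrophe
        "\xE2\x80\x9C" => '"',   // U+201C left double quote
        "\xE2\x80\x9D" => '"',   // U+201D right double quote
        "\xE2\x80\x93" => '-',   // U+2013 en dash
        "\xE2\x80\x94" => '--',  // U+2014 em dash
        "\xE2\x80\xA6" => '...', // U+2026 horizontal ellipsis
    );
    // strtr() with an array does all replacements in a single pass.
    return strtr($text, $map);
}

// Example: a curly-quoted phrase pasted from Word.
$pasted = "\xE2\x80\x9CIt\xE2\x80\x99s fine\xE2\x80\x9D \xE2\x80\x93 really\xE2\x80\xA6";
echo normalize_word_punctuation($pasted), "\n";
```

If the page and the database are UTF-8 throughout, another option may be to keep the characters and pass the charset explicitly, e.g. htmlentities($text, ENT_QUOTES, 'UTF-8'), so the curly quotes come out as entities instead of being replaced.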

-- 
regards,
Tom

-- 
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php
