or "How I Learned To Stop Breathing and Oxygenate By Osmosis"...

Hello,

I'm having a problem that I hope the experts in this group can help me
work around.

I read in the contents of an old MySQL database and made XML files out
of the data in it. The total job came out to about 6800 files, or
"Documents".

I'm finding that, sporadically, when I read these XML files and pass
the XML data to Sablotron, I get Sablotron errors. These errors stem
from certain characters scattered throughout the documents. This is the
meat of my problem.

When I look at a file that is giving errors using 'less', I see
characters like <A1>, <91>, <92>, and so on. They are highlighted in
bold reverse text, which I take to mean they are 'binary'. I'm not sure
whether to call them binary or hex; perhaps someone can tell me the
appropriate name for these characters.

Anyhow, these bytes sometimes correspond to valid characters. For
example, <A9> is the "copyright" character, or &copy;.

I've been manually replacing these characters as errors are generated,
but it's getting a little tiring.

Is there any way I can force PHP to either strip out or convert these
'binary' characters, or whatever they are, when I read the file?

Is there any way to keep PHP from saving these characters to NEW
documents when they are created?
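
To make the question concrete, here is a rough sketch of the kind of
thing I imagine doing on read. It is only a guess: it assumes the stray
bytes are Windows-1252 text (<91> and <92> are curly quotes in that code
page, and <A9> is the copyright sign), and that the XML parser wants
UTF-8. The function name cp1252_to_utf8 is made up for illustration; it
relies on PHP's iconv extension being available.

```php
<?php
// Hypothetical helper: treat $raw as Windows-1252 and return UTF-8.
// ASSUMPTION: the problem bytes really are Windows-1252. If they come
// from some other encoding, the source charset below must change.
function cp1252_to_utf8($raw)
{
    // An empty pattern with the /u modifier matches only when the
    // subject is valid UTF-8, so already-clean input is left untouched.
    if (preg_match('//u', $raw) === 1) {
        return $raw;
    }

    // iconv() returns false if it meets a byte that has no mapping in
    // the source charset (a few slots in Windows-1252 are undefined).
    $utf8 = iconv('CP1252', 'UTF-8', $raw);
    if ($utf8 === false) {
        throw new RuntimeException('Input is not valid Windows-1252');
    }
    return $utf8;
}
```

I would then call something like cp1252_to_utf8(file_get_contents($path))
before handing the data to Sablotron, and run the same conversion on
anything pulled from the database before writing a new document. I
don't know whether this is the right approach, which is partly why I'm
asking.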

Does anyone even know what I'm talking about?? hehe.

Thanks for any help, I hope to hear from someone about this.

-- 
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php
