You really need to know the encoding you are working with. Check if the page
has a charset attribute first and if it does, re-encode to utf8 first. Then try
it in mergJSON. If it chokes then the best you can do is replace any char
greater than charToNum(127) with “?”. Other than that I don't think there
is much more you can do.
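Something along these lines would do for that last-resort filter (off the top
of my head, untested, and the handler name is just a placeholder):

   function makeASCIISafe pText
      local tClean
      repeat for each char tChar in pText
         -- anything above charToNum(127) isn't plain ASCII, so swap it for "?"
         if charToNum(tChar) > 127 then
            put "?" after tClean
         else
            put tChar after tClean
         end if
      end repeat
      return tClean
   end makeASCIISafe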
Yes - it's taken from the wild (an HTML page on the internet). Then turned
into XML, then a table extracted etc - so looks to me like non-utf8 stuff
has got in there somewhere.
That's why I was wondering if there was a way to filter out arbitrary text
and make it utf8-safe. You know, like urlencode for URLs.
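Something as crude as round-tripping the text through urlEncode before it
goes into the array might do it - rough sketch, untested, names are mine:

   on jsonSafeDemo pScrapedText
      local tData
      -- escape on the way in, so only ASCII ever reaches the JSON step
      put urlEncode(pScrapedText) into tData["cell"]
      -- ... the mergJSON encode / decode round trip would happen here ...
      -- then unescape again when pulling the value back out
      put urlDecode(tData["cell"]) into pScrapedText
      return pScrapedText
   end jsonSafeDemo

The obvious downside is that the JSON ends up full of %XX escapes, so
whatever consumes it has to urlDecode the values again.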
> On 24 Jul 2015, at 7:22 am, David Bovill wrote:
>
> I'm placing the text into an array and then using Monte's mergJsonEncode
> function to encode it. Usually works fine - but in this case it looks like
> the content needs some tidying before I put it into the array.
mergJSON will choke on any non-UTF-8 text.
Any tricks to ensure that text I receive from an internet (HTML) source -
destined to be placed into a nice pretty JSON wrapper - is safe to go? At the
moment it is bugging out somewhere.
I'm placing the text into an array and then using Monte's mergJsonEncode
function to encode it. Usually works fine - but in this case it looks like
the content needs some tidying before I put it into the array.
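For reference, the basic flow is roughly this (from memory, so treat the
exact mergJSONEncode call as an assumption - as I recall it is passed the
*name* of the array variable as a string; the key is just an example):

   on buildJSON
      local tData, tJSON
      put "text scraped out of the HTML table" into tData["description"]
      -- as I recall, mergJSONEncode takes the array variable's name
      put mergJSONEncode("tData") into tJSON
      return tJSON
   end buildJSON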