On 19 Aug 2013, at 17:04, Jonathan Lynch <jonathandly...@gmail.com> wrote:
> This is just the strangest thing. On some websites - but not all - trying
> to get the html of that website using "get url" or "put url" is causing
> some characters to be substituted.
>
> These are not obscure unicode characters. They seem to be characters in the
> upper ANSI set.
>
> For example, on this web page:
> http://emergency.cdc.gov/disasters/wildfires/facts.asp
>
> If I use the following code:
>
> put URL "http://emergency.cdc.gov/disasters/wildfires/facts.asp" into field 1
>
> The right single quote character --> ’ <-- ( which is character number 146)
> gets converted into â€™
>
> I do not understand why ’ becomes â€™
>

Jonathan,

The page source for the url indicates the page is encoded as UTF-8. This is from the 'head' section of the page:

<meta http-equiv="content-type" content="text/html; charset=utf-8" />

So it looks like it may be 'obscure unicode characters'. :-)

What happens when you do something like this:

    put URL "http://emergency.cdc.gov/disasters/wildfires/facts.asp" into tTemp
    put uniDecode(uniEncode(tTemp, "UTF8")) into field 1

Cheers
Dave Cragg
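To see why one character turns into three, here is a small sketch of the byte arithmetic. It is written in Python rather than LiveCode purely to show the bytes, and it assumes the field is interpreting the downloaded data as Windows-1252 (the "ANSI" set in which the right single quote is character 146). The variable names are illustrative only.

```python
# U+2019 RIGHT SINGLE QUOTATION MARK, the character used on the page.
quote = "\u2019"

# In Windows-1252 (assumed native encoding) it is the single byte 146.
ansi_byte = quote.encode("cp1252")

# The server sends UTF-8, where the same character takes three bytes.
utf8_bytes = quote.encode("utf-8")

# Reading those three bytes as Windows-1252 yields three separate characters.
misread = utf8_bytes.decode("cp1252")

# Decoding the bytes as UTF-8 instead recovers the original character.
restored = utf8_bytes.decode("utf-8")

print(list(ansi_byte), list(utf8_bytes), misread, restored)
```

The uniEncode/uniDecode pair suggested above does the equivalent of the last step: it treats the downloaded data as UTF-8 first and only then converts it to the native character set.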