Arno Garrels wrote:
> Francois PIETTE wrote:
>>> But 3 bytes looks like UTF-8 ?
>>
>> I don't know. You said it was UTF-16 if not encoded.
>
> I installed IIS 7 on my Vista box and I found that IIS 7
> uses UTF-7 in directory listings.

Arrgh, typo above, IIS v7 uses UTF-8 of course!

> The HTTP header contains
> the "charset=UTF-8" content-type extension.
>
>
> However I think the ICS server should continue to use HTML
> entities.
> HTML entities represent both iso-8859-1 (Latin1) and Unicode
> character numbers (in Unicode the first 256 chars are the same as
> Latin1). So in order to create a _valid_ mapping an AnsiString MUST be
> converted with current ANSI code page to a UnicodeString/WideString
> first! This can be achieved easily in TextToHtmlText() by a local
> WideString variable that is assigned parameter Src : String.
> Characters above #255 must then be represented as numerical HTML
> entities (&#nnnn;). That's all, fully backwards compatible and
> works in D2009 as well :)
>
> --
> Arno Garrels
>
>
>>
>> ----- Original Message -----
>> From: "Arno Garrels" <[EMAIL PROTECTED]>
>> To: "ICS support mailing" <twsocket@elists.org>
>> Sent: Thursday, October 09, 2008 7:03 PM
>> Subject: Re: [twsocket] HTML encoding in HttpSrv func.
>> TextToHtmlText()
>>
>>
>>> Francois PIETTE wrote:
>>>>> The twothird character is not 'encoded' either as "&#8532;"
>>>>> (decimal) or as "&#x2154;" (hex)? If so, IIS sends plain UTF-16!
>>>>
>>>> Yes, no encoding at all. Just the 3 bytes. So UTF-16.
>>>
>>> But 3 bytes looks like UTF-8 ?
>>>
>>> --
>>> Arno Garrels
>>>
>>>>
>>>> --
>>>> [EMAIL PROTECTED]
>>>> http://www.overbyte.be
>>>>
>>>>
>>>> ----- Original Message -----
>>>> From: "Arno Garrels" <[EMAIL PROTECTED]>
>>>> To: "ICS support mailing" <twsocket@elists.org>
>>>> Sent: Thursday, October 09, 2008 5:26 PM
>>>> Subject: Re: [twsocket] HTML encoding in HttpSrv func.
>>>> TextToHtmlText()
>>>>
>>>>
>>>>> Francois Piette wrote:
>>>>>>> Yes, if someone has Apache or a newer IIS installed he could
>>>>>>> help. Create a file name with characters not in current ANSI
>>>>>>> code page by copying those characters from the Windows application
>>>>>>> charmap.exe. Then start a packet sniffer and log a directory
>>>>>>> listing.
>>>>>>
>>>>>> Using IIS6 on W2K3.
>>>>>
>>>>> Thanks!
>>>>>
>>>>>> The twothird character (U+2154) is sent in the dirlist as 3
>>>>>> characters : 0xE2 0x85 0x94. In the href link, the 3 characters
>>>>>> are expressed as %e2%85%94
>>>>>
>>>>> That's UTF-8 URL-encoded.
>>>>>
>>>>>> while they are binary in the text itself.
>>>>>
>>>>> The twothird character is not 'encoded' either as "&#8532;"
>>>>> (decimal) or as "&#x2154;" (hex)? If so, IIS sends plain UTF-16!
>>>>>
>>>>>> There is nothing in the html header to tell which code page or
>>>>>> charset is used.
>>>>>> --
>>>>>
>>>>> Browsers seem to be very good at detecting the correct character
>>>>> set nowadays.
>>>>>
>>>>> --
>>>>> Arno Garrels
>>>>> --
>>>>> To unsubscribe or change your settings for TWSocket mailing list
>>>>> please go to
>>>>> http://lists.elists.org/cgi-bin/mailman/listinfo/twsocket Visit
>>>>> our website at http://www.overbyte.be
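
To make the mapping proposed above for TextToHtmlText() concrete, here is a rough sketch. It is written in Python rather than Delphi and is not the ICS code: the function names, the default code page "cp1252", and the choice to emit numeric entities for everything above 127 (the message only requires it above #255) are all assumptions made for illustration. The point it tries to show is the order of operations: decode with the ANSI code page first, then write numeric entities.

```python
from urllib.parse import quote


def encode_entities(text: str) -> str:
    """Escape HTML specials and emit &#nnnn; for non-ASCII code points."""
    out = []
    for ch in text:
        cp = ord(ch)
        if ch == "&":
            out.append("&amp;")
        elif ch == "<":
            out.append("&lt;")
        elif ch == ">":
            out.append("&gt;")
        elif ch == '"':
            out.append("&quot;")
        elif cp > 127:
            # Assumption: 128..255 are also written numerically here, which
            # keeps the output pure ASCII; the message only demands it > 255.
            out.append("&#%d;" % cp)
        else:
            out.append(ch)
    return "".join(out)


def text_to_html_text(src: bytes, ansi_codepage: str = "cp1252") -> str:
    """Hypothetical analogue of the proposed fix: ANSI bytes -> Unicode -> entities."""
    # Decode with the ANSI code page first, so each character gets its real
    # Unicode number before it is turned into an entity.
    return encode_entities(src.decode(ansi_codepage))


# Byte 0x80 is the euro sign in cp1252, i.e. U+20AC, not character 128.
print(text_to_html_text(b"5 \x80 < 6"))   # 5 &#8364; &lt; 6

# The two-thirds character from the IIS6 test, U+2154:
print(encode_entities("\u2154"))          # &#8532;
print("\u2154".encode("utf-8").hex())     # e28594
print(quote("\u2154"))                    # %E2%85%94
```

Skipping the code page step and treating byte 0x80 as character number 128 would give "&#128;", which is not the euro sign. The last two lines reproduce the three bytes 0xE2 0x85 0x94 and the URL-encoded form seen in the sniffer log, which is consistent with UTF-8.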