Arno Garrels wrote:
> Francois PIETTE wrote:
>>> But 3 bytes looks like UTF-8 ?
>> 
>> I don't know. You said it was UTF-16 if not encoded.
> 
> I installed IIS 7 on my Vista box and I found that IIS 7
> uses UTF-7 in directory listings. 

Arrgh, typo above, IIS v7 uses UTF-8 of course!    

> The HTTP header contains
> the "charset=UTF-8" content-type extension.
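For illustration, here is roughly how a client could pick up that charset and decode the listing. This is a minimal sketch in Python (for brevity, not the Delphi code ICS uses). The full header value, including the text/html media type, and the file name are assumptions; only "charset=UTF-8" comes from the observation above.

```python
from email.message import Message


def charset_from_content_type(value, default="iso-8859-1"):
    """Return the charset parameter of a Content-Type value, lowercased."""
    msg = Message()
    msg["Content-Type"] = value
    # get_content_charset() returns None when no charset parameter is present.
    return msg.get_content_charset() or default


# Assumed header value; the media type is a guess for a directory listing.
header = "text/html; charset=UTF-8"

# Hypothetical listing fragment: U+2154 followed by ".txt", as raw UTF-8 bytes.
body = b"\xe2\x85\x94.txt"
text = body.decode(charset_from_content_type(header))
```

The fallback to iso-8859-1 is only a choice made for this sketch.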
> 
> 
> However, I think the ICS server should continue to use HTML
> entities.
> HTML entities represent both ISO-8859-1 (Latin-1) and Unicode
> character numbers (in Unicode the first 256 chars are the same as
> Latin-1). So in order to create a _valid_ mapping, an AnsiString MUST be
> converted with the current ANSI code page to a UnicodeString/WideString
> first! This can be achieved easily in TextToHtmlText() by a local
> WideString variable that is assigned the parameter Src : String.
> Characters above #255 must then be represented as numeric HTML
> entities (&#nnnn;). That's all, fully backwards compatible, and it
> works in D2009 as well :)
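The idea can be sketched as follows. This is Python rather than Delphi, and it is not the actual TextToHtmlText() code, only an illustration of the two steps: decode with the ANSI code page first, then emit numeric entities. The function name, the cp1252 default, and the choice to also escape the range 128..255 numerically (so the output is pure ASCII) are assumptions of this sketch.

```python
def text_to_html_text(src, ansi_codepage="cp1252"):
    """Escape text for HTML, using numeric entities for non-ASCII characters."""
    # Step 1: an ANSI byte string must be converted to Unicode with the
    # current code page first, otherwise the byte values are not valid
    # Unicode character numbers.
    if isinstance(src, bytes):
        src = src.decode(ansi_codepage)

    markup = {"&": "&amp;", "<": "&lt;", ">": "&gt;", '"': "&quot;"}
    out = []
    for ch in src:
        if ch in markup:
            out.append(markup[ch])
        elif ord(ch) > 127:
            # Step 2: numeric entity carrying the Unicode code point.
            out.append("&#%d;" % ord(ch))
        else:
            out.append(ch)
    return "".join(out)
```

Byte 0x80 shows why the conversion matters: in code page 1252 it is the euro sign, U+20AC, so the correct entity is &#8364; and not &#128;.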
> 
> --
> Arno Garrels
> 
> 
>> 
>> ----- Original Message -----
>> From: "Arno Garrels" <[EMAIL PROTECTED]>
>> To: "ICS support mailing" <twsocket@elists.org>
>> Sent: Thursday, October 09, 2008 7:03 PM
>> Subject: Re: [twsocket] HTML encoding in HttpSrv func.
>> TextToHtmlText()
>> 
>> 
>>> Francois PIETTE wrote:
>>>>> The twothird character is not 'encoded' either as "&#8532;"
>>>>> (decimal) or as "&#x2154;" (hex)? If so, IIS sends plain UTF-16!
>>>> 
>>>> Yes, no encoding at all. Just the 3 bytes. So UTF-16.
>>> 
>>> But 3 bytes looks like UTF-8 ?
>>> 
>>> --
>>> Arno Garrels
>>> 
>>>> 
>>>> --
>>>> [EMAIL PROTECTED]
>>>> http://www.overbyte.be
>>>> 
>>>> 
>>>> ----- Original Message -----
>>>> From: "Arno Garrels" <[EMAIL PROTECTED]>
>>>> To: "ICS support mailing" <twsocket@elists.org>
>>>> Sent: Thursday, October 09, 2008 5:26 PM
>>>> Subject: Re: [twsocket] HTML encoding in HttpSrv func.
>>>> TextToHtmlText()
>>>> 
>>>> 
>>>>> Francois Piette wrote:
>>>>>>> Yes, if someone has Apache or a newer IIS installed he could
>>>>>>> help. Create a file name with characters not in the current ANSI
>>>>>>> code page by copying those characters from the Windows application
>>>>>>> charmap.exe. Then start a packet sniffer and log a directory
>>>>>>> listing.
>>>>>> 
>>>>>> Using IIS6 on W2K3.
>>>>> 
>>>>> Thanks!
>>>>> 
>>>>>> The two-thirds character (U+2154) is sent in the dirlist as 3
>>>>>> bytes: 0xE2 0x85 0x94. In the href link, the 3 bytes
>>>>>> are expressed as %e2%85%94
>>>>> 
>>>>> That's UTF-8 URL-encoded.
>>>>> 
>>>>>> while they are binary in the text itself.
>>>>> 
>>>>> The twothird character is not 'encoded' either as "&#8532;"
>>>>> (decimal) or as "&#x2154;" (hex)? If so, IIS sends plain UTF-16!
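The byte counts discussed here can be checked quickly. A small Python sketch, using only the standard library, that prints nothing and just computes the different representations of U+2154:

```python
from urllib.parse import quote

ch = "\u2154"  # VULGAR FRACTION TWO THIRDS

utf8_bytes = ch.encode("utf-8")       # three bytes: E2 85 94
utf16_bytes = ch.encode("utf-16-be")  # two bytes: 21 54

# quote() percent-encodes the UTF-8 bytes; it emits uppercase hex digits,
# which is equivalent to the lowercase form seen in the href.
url_encoded = quote(ch).lower()

decimal_entity = "&#%d;" % ord(ch)
hex_entity = "&#x%X;" % ord(ch)
```

Three bytes on the wire therefore match UTF-8; UTF-16 needs only two bytes for this character.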
>>>>> 
>>>>>> There is nothing in the html header to tell which code page or
>>>>>> charset is used. --
>>>>> 
>>>>> Browsers seem to be very good at detecting the correct character
>>>>> set nowadays.
>>>>> 
>>>>> --
>>>>> Arno Garrels
>>>>> --
>>>>> To unsubscribe or change your settings for TWSocket mailing list
>>>>> please goto
>>>>> http://lists.elists.org/cgi-bin/mailman/listinfo/twsocket Visit
>>>>> our website at http://www.overbyte.be